PRODUCTION OF FACTOR VIII AND RELATED PRODUCTS
Prior Application

This application is a continuation-in-part application of
U.S. Application Serial No. 546,650 filed on October 28, 1983.

Background of the Invention

This invention relates to the preparation of recombinant
deoxyribonucleic acid (DNA) which codes for cellular production of
human factor VIII:C, and of DNA which codes for porcine factor
VIII:C, to methods of obtaining DNA molecules which code for
factor VIII:C, and to expression of human and porcine factor
VIII:C utilizing such DNA, as well as to novel compounds,
including deoxyribonucleotides and ribonucleotides utilized in
obtaining such clones and in achieving expression of human factor
VIII:C. This invention also relates to human AHF and its
production by recombinant DNA techniques.

Factor VIII:C is a blood plasma protein that is defective
or absent in Hemophilia A disease. This disease is a hereditary
bleeding disorder affecting approximately one in 20,000 males.
Factor VIII:C has also been known or referred to as factor VIII,
the antihemophilic factor (AHF), antihemophilic globulin (AHG),
hemophilic factor A, platelet cofactor, thromboplastinogen, and
thrombocytolysin. It is referred to as "Factor VIII:C" to
indicate that it is the compound which affects clotting activity.
As used herein, "factor VIII:C" and "AHF" are synonymous.

Although the isolation of AHF from blood plasma has been 
described in the literature, the precise structure of AHF has not 
previously been identified, due in part to the unavailability of 
sufficient quantities of pure material, and the proteolytic nature 
of many contaminants and purification agents. While some 
quantities of impure AHF have been available as a concentrated 
preparation processed from fresh-frozen human plasma, the 
extremely low concentration of AHF in human plasma and the high 
cost of obtaining and processing human plasma make the cost of
this material prohibitive for any extensive treatment of
hemophilia.

The present invention makes it possible to produce human
AHF using recombinant DNA techniques.

AHF, like other proteins, is comprised of some twenty
different amino acids arranged in a specific array. By using gene
manipulation techniques, a method has been developed which enables
production of AHF by identifying the gene which codes
for the human AHF protein, cloning that gene, incorporating that
gene into a recombinant DNA vector, transforming a suitable host
with the vector which includes that gene, expressing the human AHF
gene in such host, and recovering the human AHF produced thereby.
Similarly, the present invention makes it possible to produce 
porcine AHF by recombinant DNA techniques, as well as providing 
products and methods related to such porcine AHF production. 

Recently developed techniques have made it possible to 
employ microorganisms, capable of rapid and abundant growth, for
the synthesis of commercially useful proteins and peptides,
regardless of their source in nature. These techniques make it 
possible to genetically endow a suitable microorganism with the 
ability to synthesize a protein or peptide normally made by 
another organism. The technique makes use of fundamental 
relationships which exist in all living organisms between the 
genetic material, usually DNA, and the proteins synthesized by the 
organism. This relationship is such that production of the amino 
acid sequence of the protein is coded for by a series of
three-nucleotide sequences of the DNA. There are one or more
trinucleotide sequence groups (called codons) which specifically
code for the production of each of the twenty amino acids most
commonly occurring in proteins. The specific relationship between
each given trinucleotide sequence and the corresponding amino acid
for which it codes constitutes the genetic code. As a
consequence, the amino acid sequence of every protein or peptide
is reflected by a corresponding nucleotide sequence, according to
a well understood relationship. Furthermore, this sequence of
nucleotides can, in principle, be translated by any living
organism. For a discussion of the genetic code, see J. D. Watson,
Molecular Biology of the Gene (W. A. Benjamin, Inc., 1977), the
disclosure of which is incorporated herein by reference,
particularly at 347-77; C. F. Norton, Microbiology (Addison-
Wesley 1981), and U.S. Patent No. 4,363,877, the disclosure of
which is incorporated herein by reference.

The twenty amino acids from which proteins are made are
phenylalanine (hereinafter sometimes referred to as "Phe" or "F"),
leucine ("Leu", "L"), isoleucine ("Ile", "I"), methionine ("Met",
"M"), valine ("Val", "V"), serine ("Ser", "S"), proline ("Pro",
"P"), threonine ("Thr", "T"), alanine ("Ala", "A"), tyrosine
("Tyr", "Y"), histidine ("His", "H"), glutamine ("Gln", "Q"),
asparagine ("Asn", "N"), lysine ("Lys", "K"), aspartic acid
("Asp", "D"), glutamic acid ("Glu", "E"), cysteine
("Cys", "C"), tryptophan ("Trp", "W"), arginine ("Arg", "R") and
glycine ("Gly", "G"). The amino acids coded for by the various
combinations of trinucleotides which may be contained in a given
codon may be seen in Table 1:

TABLE 1

The Genetic Code

First Position    Second Position    Third Position

*The "Stop" or termination codon terminates the expression of the
protein.

Knowing the deoxyribonucleotide sequence of the gene or
DNA sequence which codes for a particular protein allows the exact
description of that protein's amino acid sequence. However, the
converse is not true: while methionine is coded for by only one
codon, the other amino acids can be coded for by up to six codons
(e.g. serine), as is apparent from Table 1. Thus there is
considerable ambiguity in predicting the nucleotide sequence from
the amino acid sequence.
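
The ambiguity described above can be illustrated with a short sketch. The following Python fragment is not part of the original disclosure. It assumes the standard genetic code written in DNA letters (T rather than U), and the names CODON_TABLE, codons_for and count_encodings are hypothetical helpers introduced only for illustration.

```python
from itertools import product

# Standard genetic code, DNA alphabet. Codons are generated in the order
# TTT, TTC, TTA, TTG, TCT, ... and paired with the one-letter amino acid
# codes below ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(amino_acid):
    """Return every codon that codes for the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def count_encodings(peptide):
    """Count the distinct nucleotide sequences that could encode a peptide."""
    total = 1
    for amino_acid in peptide:
        total *= len(codons_for(amino_acid))
    return total


if __name__ == "__main__":
    print(codons_for("M"))          # methionine: a single codon
    print(codons_for("S"))          # serine: six codons
    print(count_encodings("MS"))    # 1 * 6 possible coding sequences
```

A nucleotide sequence therefore determines one amino acid sequence, while an amino acid sequence is compatible with many nucleotide sequences, and the count grows multiplicatively with peptide length.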

In sum, prior to the present invention, very little was 
known about the structure of AHF, and, despite substantial work
over many years, those skilled in this art were unable to
determine the structure of AHF, or of its gene, or provide any
procedure by which AHF could be produced in substantially pure
form in substantial quantities.

The method described herein by which the gene for human 
AHF is cloned and expressed includes the following steps:

(1) Purification of porcine AHF;

(2) Determination of the amino acid sequence 
of porcine AHF; 

(3) Formation of oligonucleotide probes, and 
use of those probes to identify and/or isolate 
at least a fragment of the gene which codes 
for porcine AHF; 

(4) Use of the porcine AHF gene fragment to 
identify and isolate human genetic material 
which codes for human AHF; 

(5) Using the previously described AHF DNA 
fragments to determine the site of synthesis 
of AHF from among the various mammalian 
tissues; 

(6) Producing cDNA segments which code for
human and porcine AHF, using messenger RNA
obtained from the tissue identified in
step 5;

(7) Constructing full length human and
porcine cDNA clones from the cDNA segments
produced in step 6, e.g. by ligating together
cDNA segments which were cut by the same
restriction enzymes;

(8) Forming DNA expression vectors which are
capable of directing the synthesis of AHF;

(9) Transforming a suitable host with the
expression vectors bearing the full length
cDNA for human or porcine AHF;

(10) Expressing human or porcine AHF in the
host; and

(11) Recovering the expressed AHF.

In the course of this work, a new technique of screening 
a genomic DNA library has been developed utilizing 
oligonucleotide probes based on the amino acid sequences 
contained in the AHF molecule. 

The invention includes the above methods, along with the
various nucleotides, vectors, and other products made in 
connection therewith. 

Brief Summary of the Drawings

Figure 1 is a depiction of the amino acid sequence (A)
determined for the amino terminal sequence of the 69,000 dalton
thrombin cleavage product described in Example 1. The first
residue was not identifiable. This sequence is compared with
sequence (B) deduced from the nucleotide sequence of the porcine
AHF exon described in Example 3, and with the human AHF exon
sequence (C) described in Example 4.

Figure 2 illustrates the amino acid sequence, shown in
single letter code, of bovine thrombin digestion fragments of (A)
the amino terminus and (B) a 40 Kd thrombin cleavage product of
the 166 Kd porcine AHF fragment isolated by Fass et al., infra.

Figure 3 illustrates the design of an oligonucleotide
probe for the identification and isolation of at least a portion
of the porcine gene which codes for AHF.

Figure 4 illustrates the DNA sequence for a Sna I DNA
fragment (34-S1) which contains a porcine AHF exon, derived from
bacteriophage PB34 described in Example 3.

Figure 5 is a representation of the DNA sequence of the
Hae III insert 34-H1 bearing the exon for porcine AHF, as
described in Example 3. This sequence is included within the
longer sequence shown in Figure 4 and corresponds to nucleotides
250-615 of the sequence in Figure 4.

Figure 6 is a representation of the DNA sequence along
with the deduced amino acid sequence for nucleotides 34-84 for a
portion of the Sau 3AI insert of clone 25-S1, showing a portion
of the exon for human AHF, as described in Example 4.

Figure 7 illustrates the DNA nucleotide sequence (shown
in one strand only) which contains the entire sequence coding for
human AHF, along with the deduced amino acid sequence for human
AHF.

Detailed Description of the Invention

The following definitions are supplied in order to
facilitate the understanding of this case. To the extent that
the definitions vary from meanings circulating within the art,
the definitions below are to control.

Amplification means the process by which cells produce
gene repeats within their chromosomal DNA.

Cotransformation means the process of transforming a cell
with more than one exogenous gene foreign to the cell, one of
which confers a selectable phenotype on the cell.

Downstream means the direction going towards the 3' end
of a nucleotide sequence.

An enhancer is a nucleotide sequence that can potentiate
the transcription of genes independent of the identity of the
gene, the position of the sequence in relation to the gene, or
the orientation of the sequence.

A gene is a deoxyribonucleotide sequence coding for a
given mature protein. For the purposes herein, a gene shall not
include untranslated flanking regions such as RNA transcription
initiation signals, polyadenylation addition sites, promoters or
enhancers.

A selection gene is a gene that confers a phenotype on
cells which express the gene as a detectable protein.

A selection agent is a condition or substance that 
enables one to detect the expression of a selection gene. 

Phenotype means the observable properties of a cell as 
expressed by the cellular genotype. 

Genotype means the genetic information contained within a 
cell as opposed to its expression, which is observed as the 
phenotype. 

Ligation is the process of forming a phosphodiester bond
between the 5' and 3' ends of two DNA strands. This may be
accomplished by several well-known enzymatic techniques,
including blunt end ligation by T4 ligase.

Orientation refers to the order of nucleotides in a DNA
sequence. An inverted orientation of a DNA sequence is one in
which the 5' to 3' order of the sequence in relation to another
sequence is reversed when compared to a point of reference in the
DNA from which the sequence was obtained. Such points of
reference can include the direction of transcription of other 
specified DNA sequences in the source DNA or the origin of 
replication of replicable vectors containing the sequence. 
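
As an illustration of the definition of orientation, the sketch below shows how a double-stranded segment reads when its 5' to 3' order is reversed relative to a reference strand. This is an assumption-laden illustration and not part of the original disclosure: it treats inversion as taking the reverse complement of the segment, and the function name reverse_complement and the example sequences are hypothetical.

```python
# Map each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the sequence read 5' to 3' on the opposite strand."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def insert_segment(vector_left, segment, vector_right, inverted=False):
    """Join a segment between two flanking sequences in either orientation."""
    if inverted:
        segment = reverse_complement(segment)
    return vector_left + segment + vector_right


if __name__ == "__main__":
    # Hypothetical flanks and segment, chosen only to make the effect visible.
    print(insert_segment("GG", "AATAAA", "CC"))
    print(insert_segment("GG", "AATAAA", "CC", inverted=True))
```

Inverting a segment twice restores the original reading, which is why orientation is only meaningful relative to a stated point of reference.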

Transcription means the synthesis of RNA from a DNA 
template. 

Transformation means changing a cell's genotype by the 
cellular uptake of exogenous DNA. Transformation may be detected
in some cases by an alteration in cell phenotype. Transformed
cells are called transformants. Pre-transformation cells are
referred to as parental cells.

Translation means the synthesis of a polypeptide from
messenger RNA (mRNA).

The present invention permits, for the first time, the
identification and isolation of the porcine Factor VIII:C gene by
recombinant DNA techniques. It also permits, for the first time,
the isolation and identification of the gene which encodes human
factor VIII:C by recombinant DNA techniques, by taking advantage
of the homology between porcine AHF DNA and human AHF DNA in
order to locate and isolate the human gene for AHF. This route
to cDNA clones for producing AHF via homology with the porcine
AHF gene avoids the tedious, time-consuming and expensive need to
purify human AHF, which is highly expensive and essentially
unavailable.

Porcine AHF is first highly purified, most preferably by
a monoclonal antibody purification technique such as that
disclosed by David Fass et al., "Monoclonal Antibodies to Porcine
AHF Coagulant and Their Use in the Isolation of Active Coagulant
Protein", Blood 59: 594-600 (1982), and Knutsen et al., "Porcine
Factor VIII:C Prepared by Affinity Interaction with Von
Willebrand Factor and Heterologous Antibodies," Blood 59: 615-24
(1982), the disclosures of both of which are incorporated herein
by reference. Porcine factor VIII:C polypeptides are bound to
anti-VIII:C monoclonal antibodies which are immobilized on a
suitable affinity chromatography column. Two large molecular
weight polypeptides, having molecular sizes of about 166 and 130
Kd, are eluted with ethylenediamine tetraacetic acid. Another
protein segment, having a molecular weight of about 76 Kd, is then
eluted from the column, utilizing about 50% ethylene glycol. The
partial amino acid sequences of these polypeptides, and/or of
fragments of these polypeptides obtained in a known manner, e.g.
from enzymatic digestion of the proteins, using bovine thrombin
or other suitable agents to break up the proteins, are then
determined by known methods of analysis. Based on the amino acid
sequence of these materials, oligonucleotide probes are
synthesized, at least some of which will hybridize with DNA
segments which code for the corresponding segment of AHF. These
oligonucleotides are then used to screen for segments of the gene
which code for porcine AHF.
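
One consideration in designing probes from an amino acid sequence is that stretches rich in amino acids with few codons require fewer candidate oligonucleotides. The sketch below is a generic illustration under stated assumptions, not the probe design of this disclosure: it assumes the standard genetic code in DNA letters, the peptide "LSMWDKR" is a hypothetical example rather than an AHF sequence, and the function names are invented for illustration.

```python
from itertools import product
from math import prod

# Standard genetic code, DNA alphabet, grouped by one-letter amino acid code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append("".join(codon))


def degeneracy(peptide):
    """Number of distinct DNA sequences that could encode the peptide."""
    return prod(len(CODONS[aa]) for aa in peptide)


def least_degenerate_window(peptide, width):
    """Return (start, window, degeneracy) for the least ambiguous window."""
    starts = range(len(peptide) - width + 1)
    best = min(starts, key=lambda i: degeneracy(peptide[i:i + width]))
    window = peptide[best:best + width]
    return best, window, degeneracy(window)


def coding_sequences(peptide):
    """Enumerate every DNA sequence that encodes the peptide."""
    return ["".join(c) for c in product(*(CODONS[aa] for aa in peptide))]


if __name__ == "__main__":
    start, window, count = least_degenerate_window("LSMWDKR", 3)
    print(start, window, count)
    print(coding_sequences(window))
```

In practice a probe pool would be synthesized as the complement of the candidate coding sequences, or with mixed bases at the ambiguous positions; the sketch only counts and lists the coding-strand possibilities.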

Once a portion of the porcine gene for AHF is obtained,
that recombinant material is used to screen a human genomic DNA
library to locate and isolate the gene which codes for human AHF.
In this procedure, it is established that there are substantial
similarities between human AHF and porcine AHF, and advantage is
taken of those similarities to isolate and identify the human
factor VIII:C gene or gene segments.

The similarities in the porcine and human proteins are
attributable to corresponding similarities in the DNA sequences
which code for the amino acid sequences. The genetic materials
coding for the AHF proteins in humans and pigs are identical at a
high percentage of positions, and thus exhibit hybridization when
subjected to the procedure of, for example, Benton and Davis,
"Screening λgt Recombinant Clones by Hybridization to Single
Plaques In Situ," Science, 196:180 (1977), the disclosure of
which is incorporated herein by reference.
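
The notion of two sequences being "identical at a high percentage of positions" can be made concrete with a simple calculation. The sketch below is illustrative only: it assumes the two sequences are already aligned and of equal length, the example sequences are hypothetical and are not AHF sequences, and no particular identity threshold for hybridization is implied.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Hypothetical aligned fragments differing at one of twelve positions.
    print(percent_identity("ATGGCCAAGTTC", "ATGGCTAAGTTC"))
```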

The cloned segments of the gene for human AHF are then
used to identify a source for AHF mRNA from among various human
tissues. Similarly, one or more of the cloned segments for
porcine AHF are used to screen potential tissue sources for
porcine mRNA. This step is accomplished by conventional
procedures involving RNA extraction, gel electrophoresis,
transfer to nitrocellulose sheets, and hybridization to
radiolabelled cloned DNA as described for example in Maniatis
et al., Molecular Cloning, A Laboratory Manual (Cold Spring
Harbor Laboratory, 1982). Once this tissue source is identified,
cDNA libraries are prepared (in either plasmid or preferably
bacteriophage lambda vectors, both of which are generally
available) and screened for AHF cDNA clones as described below.
The process of cDNA library screening is repeated until a set of
cDNA clones is obtained which, by DNA sequence determination, is
shown to comprise the entire DNA gene sequence encoding AHF.

Once the full length human or porcine AHF cDNA clone is
obtained, known and appropriate means are utilized to express the
AHF protein, e.g. insertion into an appropriate vector,
transfection into an appropriate host, selection of transformed
cells (transformants), and culture of these transformants to
express AHF activity.

Host-vector systems for the expression of AHF may be
procaryotic, but the complexity of AHF makes the preferred
expression system eucaryotic, preferably (at least for
biologically-active AHF having clotting activity) a mammalian
one. This is easily accomplished by transformation of eucaryotic
cells (usually mammalian or vertebrate cells) with a suitable AHF
vector. Eucaryotic transformation is in general a well-known
process, and may be accomplished by a variety of standard
methods. These include the use of protoplast fusion, DNA
microinjection, chromosome transfection, lytic and nonlytic viral
vectors (for example, Mulligan et al., "Nature" (London)
277:108-114 (1979)), cell-cell fusion (Fournier et al., "Proc.
Nat. Acad. Sci." 74:319-323 (1977)), lipid structures (U.S. patent
4,394,448) and cellular endocytosis of DNA precipitates (Bachetti
et al., "Proc. Nat. Acad. Sci." 74:1590-1594 (1977)). Other
eucaryotic cells, such as yeasts or insect cells, may also be
used to advantage.

Transformation which is mediated by lytic viral vectors 
is efficient but is disadvantageous for a number of reasons: The 
maximum size of transfected DNA is limited by the geometry of
viral capsid packing, the exogenous genes are frequently deleted 
during viral replication, there is a requirement for helper virus 
or specialized hosts, host cells must be permissive, and the 
hosts are killed in the course of the viral infection. 

Nonlytic transformations are based on the transcription
and translation of virus vectors which have been incorporated
into a cell line as a stable episome. These systems generally
require unique cell lines and suffer from a number of
disadvantages. See "Trends in Biochemical Sciences", June 1983,
pp. 209-212.

On the other hand, other transformation methods in which
extrachromosomal DNA is taken up into the chromosomes of host
cells have been characterized by low frequencies of
transformation and poor expression levels. These initial
difficulties were ameliorated by transformation with genes which
inheritably confer selectable phenotypes on the small
subpopulation of cells that are in fact transformed (selection
genes). The entire population of transformed cells can be grown
under conditions favoring cells having acquired the phenotype,
thus making it possible to locate transformed cells conveniently.
Thereafter, transformants can be screened for the capability to
more intensely express the phenotype. This is accomplished by
changing a selection agent in such a way as to detect higher
expression.

Selection genes fall into three categories: Detectably 
amplified selection genes, dominant selection genes, and 
detectably amplified dominant selection genes. 

Detectably amplified selection genes are those in which 
amplification can be detected by exposing host cells to changes 
in the selection agent. Detectably amplified genes which are not 
dominant acting generally require a parental cell line which is 
genotypically deficient in the selection gene. Examples include 
the genes for hydroxymethylglutaryl CoA reductase (Sinensky,
"Biochem. Biophys. Res. Commun." 78:863 (1977)), ribonucleotide
reductase (Meuth et al., "Cell" 3:367 (1974)), aspartate
transcarbamylase (Kemp et al., "Cell" 9:541 (1976)), adenylate
deaminase (DeBatisse et al., "Mol. and Cell Biol." 2(11):1346-1353
(1982)), mouse dihydrofolate reductase (DHFR) and, with a defective
promoter, mouse thymidine kinase (TK).



WO 85/01961    PCT/US84/01641

Dominant selection genes are those which are expressed in
transformants regardless of the genotype of the parental cell.
Most dominant selection genes are not detectably amplified
because the phenotype is so highly effective in dealing with the
selection agent that it is difficult to discriminate among cell
lines that have or have not amplified the gene. Examples of
dominant selection genes of this type include the genes for
procaryotic enzymes such as xanthine-guanine
phosphoribosyltransferase (Mulligan et al., "Proc. Nat. Acad.
Sci." 78(4):2072-2076 (1981)) and aminoglycoside
3'-phosphotransferase (Colbere-Garapin et al., "J. Mol. Biol.",

Some dominant selection genes also are detectably
amplified. Suitable examples include the mutant gene
described by Baber et al., "Somatic Cell Genet." 4:499-508
(1982), cell surface markers such as HLA antigens and genes
coding for enzymes such as specific esterases that produce
fluorescent or colored products from fluorogenic or chromogenic
substrates as is known in the art.

Detectably-amplified, dominant selection genes are 
preferred for use herein. It should be understood that a 
dominant selection gene in some cases can be converted to a 
detectably amplified gene by suitable mutations in the gene. 

Selection genes at first were of limited commercial
utility. While they enabled one to select transformants having
the propensity to amplify uptaken DNA, most selection genes
produced products of no commercial value. On the other hand,
genes for products which were commercially valuable generally did
not confer readily selectable (or even detectable) phenotypes on
their transformants. This would be the case, for example, with
enzymes or hormones which do not provide transformed cells with
unique nutrient metabolic or detoxification capabilities. Most
proteins of commercial interest fall into this group, e.g.
hormones, proteins participating in blood coagulation and
fibrinolytic enzymes.

Subsequently it was found that eucaryotic cells having
the propensity to be transformed with and amplify the selection
gene would do the same in the case of the product gene. By
following the selection gene one could identify a subpopulation
of transformant cells which coexpress and coamplify the product
gene along with the selection gene. It has been the practice to
culture the transformants in the presence of the selection agent
and to conclude that transformants having increased expression of
the selection gene will also show increased expression of the
product gene. This is not always the case, as will be more fully
explored below. Axel et al. (U.S. patent 4,399,216) use the term
cotransformation to describe the process of transforming a cell
with more than one different gene, whether by vector systems
containing covalently linked or unlinked genes, and in the latter
case whether the genes are introduced into host cells
sequentially or simultaneously. Cotransformation should "allow
the introduction and stable integration of virtually any defined
gene into cultured cells" (Wigler et al., "Cell" 16:777-785
(1979)), and "by use of the cotransformation process it is
possible to produce eucaryotic cells which synthesize desired
proteinaceous and other materials" (U.S. patent 4,399,216, column
3, lines 37-42).

Transformation Vectors

Vectors used in AHF cotransformation will contain a
selection gene and the AHF gene. In addition there usually will
be present in the transformation or cotransformation vectors
other elements such as enhancers, promoters, introns, accessory
DNA, polyadenylation sites and 3' noncoding regions as will be
described below.

Suitable selection genes are described above. It is 
preferred that the selection agent be one that prevents cell 
growth in the absence of the selection gene. That way, revertant 
cells in large scale culture that lose the selection gene (and 
presumably the AHF gene as well) will not overgrow the
fermentation. However, it would be desirable in the commercial
production of AHF to avoid the use of cell toxins, thereby
simplifying the product purification steps. Thus, a desirable
selection gene would be one that enables transformants to use a
nutrient critical for growth that they otherwise would not be
able to use. The TK gene described above is an example.

Two classes of vectors have been employed in
cotransformation. The first class are the unlinked vectors.
Here the selection gene and the AHF gene are not covalently
bound. This vector class is preferred because the step of
ligating or otherwise bonding the two genes is not required.
This simplifies the transformation process because the selection
and product genes usually are obtained from separate sources and
are not ligated in their wild-type environment. In addition, the
molar ratio of the AHF and selection genes employed during
cotransformation can be adjusted to increase cotransformation
efficiency.

The second class of cotransformation vectors are linked
vectors. These vectors are distinguished from unlinked vectors
in that the selection and AHF genes are covalently bound,
preferably by ligation.

The vectors herein may also include enhancers. Enhancers 
are functionally distinct from promoters, but appear to operate 
in concert with promoters. Their function on the cellular level
is not well understood, but their unique characteristic is the
ability to activate or potentiate transcription without being
position or orientation dependent. Promoters need to be upstream
of the gene, while enhancers may be present upstream or 5' from
the promoter, within the gene as an intron, or downstream from
the gene between the gene and a polyadenylation site or 3' from
the polyadenylation site. Inverted promoters are not functional,
but inverted enhancers are. Enhancers are cis-acting, i.e., they
have an effect on promoters only if they are present on the same
DNA strand. For a general discussion of enhancers see Khoury et
al., "Cell" 33:313-314 (1983).

Preferred enhancers are obtained from animal viruses such
as simian virus 40, polyoma virus, bovine papilloma virus,
retrovirus or adenovirus. Viral enhancers may be obtained
readily from publicly available viruses. The enhancer regions
for several viruses, e.g., Rous sarcoma virus and simian virus
40, are well known. See Luciw et al., "Cell" 33:705-716 (1983).
It would be a matter of routine chemistry to excise these regions
on the basis of published restriction maps for the virus in
question and, if necessary, modify the sites to enable splicing
the enhancer into the vector as desired. For example, see
Kaufman et al., "J. Mol. Biol.", 601-621 (1982) and "Mol. Cell
Biol." 2(11):1304-1319 (1982), the disclosures of both of which
are incorporated herein by reference. Alternatively, the
enhancer may be synthesized from sequence data; the sizes of
viral enhancers (generally less than about 150 bp) are
sufficiently small that this could be accomplished practically.

Another element which should be present in the vector
assembly is a polyadenylation splicing (or addition) site. This
is a DNA sequence located downstream from the translated regions
of a gene, shortly downstream from which in turn transcription
stops and adenine ribonucleotides are added to form a polyadenine
nucleotide tail at the 3' end of the messenger RNA.
Polyadenylation is important in stabilizing the messenger RNA
against degradation in the cell, an event that reduces the level
of messenger RNA and hence the level of product protein.

Eucaryotic polyadenylation sites are well known. A
consensus sequence exists among eucaryotic genes: the
hexanucleotide 5'-AAUAAA-3' is found 11-30 nucleotides from the
point at which polyadenylation starts. DNA sequences containing
polyadenylation sites may be obtained from viruses in accord with
published reports. Exemplary polyadenylation sequences can be
obtained from mouse beta-globin, and simian virus 40 late or
early region genes, but viral polyadenylation sites are
preferred. Since these sequences are known, they may be
synthesized in vitro and ligated to the vectors in conventional
fashion.
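
The spacing rule for the consensus hexanucleotide lends itself to a small search routine. The sketch below is an illustration under stated assumptions, not a method taken from this disclosure: it searches the DNA form of the signal (AATAAA), it assumes the 11-30 nucleotide distance is counted from the last base of the hexanucleotide in the downstream direction, and the function name and the test sequence are hypothetical.

```python
import re

SIGNAL = "AATAAA"  # DNA form of the 5'-AAUAAA-3' consensus hexanucleotide


def candidate_polya_windows(dna, min_gap=11, max_gap=30):
    """Locate consensus signals and the downstream window for each.

    Returns a list of (signal_start, window_start, window_end) tuples using
    0-based indices, where dna[window_start:window_end] covers the
    min_gap-th through max_gap-th nucleotides after the hexanucleotide.
    """
    dna = dna.upper()
    results = []
    # A lookahead pattern reports overlapping occurrences as well.
    for match in re.finditer("(?=" + SIGNAL + ")", dna):
        signal_end = match.start() + len(SIGNAL)
        window_start = signal_end + min_gap - 1
        window_end = min(signal_end + max_gap, len(dna))
        if window_start < len(dna):
            results.append((match.start(), window_start, window_end))
    return results


if __name__ == "__main__":
    # Hypothetical sequence: three bases, the signal, then forty C residues.
    example = "GGG" + SIGNAL + "C" * 40
    print(candidate_polya_windows(example))
```

A routine of this kind only flags candidate regions; whether a given signal is functional in a vector would still have to be established experimentally.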

A polyadenylation region must be located downstream from
the AHF gene and/or the selection gene, but may be ligated to
either gene. It may be ligated to the selection gene only, and
not the product gene, and this will be the case whether the
vectors are linked or unlinked. The sequence which separates the
polyadenylation site from the translational stop codon is
preferably an untranslated DNA oligonucleotide such as an
unpromoted eucaryotic gene. Since such oligonucleotides and
genes are not endowed with a promoter they will not be expressed.
The oligonucleotide should extend for a considerable distance, on
the order of up to about 1,000 bases, from the stop codon to the
polyadenylation site. This 3' untranslated oligonucleotide
generally results in an increase in product yields. The vector
may terminate from about 10 to about 30 bp downstream from the
consensus sequence, but it is preferable to retain the 3'
sequences found downstream from the polyadenylation site in its 
wild-type environment. These sequences typically extend about 
from 200 to 600 base pairs downstream from the polyadenylation 
site. 

The vectors described herein may be synthesized by
techniques well known to those skilled in this art. The
components of the vectors such as selection genes, enhancers,
promoters, and the like may be obtained from natural sources or
synthesized as described above. Basically, if the components are
found in DNA available in large quantity, e.g. components such as
viral functions, or if they may be synthesized, e.g.
polyadenylation sites, then with appropriate use of restriction
enzymes large quantities of vector may be obtained by simply
culturing the source organism, digesting its DNA with an
appropriate endonuclease, separating the DNA fragments,
identifying the DNA containing the element of interest and
recovering same. Ordinarily, a transformation vector will be
assembled in small quantity and then ligated to a suitable
autonomously replicating synthesis vector such as a procaryotic
plasmid or phage. The pBR322 plasmid may be used in most cases.
See Kaufman et al., op. cit.

The synthesis vectors are used to clone the ligated 
transformation vectors in conventional fashion, e.g. by 
transfection of a permissive procaryotic organism, replication of
the synthesis vector to high copy number and recovery of the 
synthesis vector by cell lysis and separation of the synthesis 
vector from cell debris. 

The resulting harvest of synthesis vector may be directly
transfected into eucaryotic cells, or the transformation vector
may be rescued from the synthesis vector by appropriate
endonuclease digestion, separation by molecular weight and
recovery of the transformation vector. Transformation vector
rescue is not necessary so long as the remainder of the synthesis
vector does not adversely affect eucaryotic gene amplification,
transcription or translation. For example, the preferred
synthesis vector herein is a mutant of the E. coli plasmid pBR322
in which sequences have been deleted that are deleterious to
eucaryotic cells. See Kaufman et al., op. cit. Use of this
mutant obviates any need to delete the plasmid residue prior to
cotransformation.

Cotransformation, Selection and Detection of Amplification

The cells to be transformed may be any eucaryotic cell,
including yeast protoplasts, but ordinarily a nonfungal cell.
Primary explants (including relatively undifferentiated cells
such as stem cells), and immortal and/or transformed cell lines
are suitable. Candidate cells need not be genotypically
deficient in the selection gene so long as the selection gene is
dominant acting.

The cells preferably will be stable mammalian cell lines
as is discussed above. Cell lines that are known to stably
integrate selection genes into their chromosomal DNA are best,
for example Chinese hamster ovary (CHO) cell lines. Also usable
are HeLa, COS monkey cells, melanoma cell lines such as the Bowes
cell line, mouse L cells, mouse fibroblasts and mouse NIH 3T3
cells.

Cotransformation with unlinked vectors may be
accomplished serially or simultaneously (see U.S. Patent
4,399,216). Methods for facilitating cellular uptake of DNA are
described above. Microinjection of the vector into the cell
nucleus will yield the highest transformation efficiencies, but
exposing parental cells to DNA in the form of a calcium phosphate
precipitate is most convenient. Considerably better
cotransformation efficiencies result from cotransformation with a
molar excess of product to selection gene, on the order of 100:1.
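
As a rough illustration of what a 100:1 molar ratio means when
weighing out DNA, the following Python sketch converts a molar ratio
into a mass of product vector. The function name, the vector sizes
and the DNA mass in the example are hypothetical and are not taken
from this specification.

```python
def product_dna_mass(selection_mass_ug, selection_bp, product_bp, molar_ratio=100.0):
    """Mass (in micrograms) of product vector for a given product:selection molar ratio.

    Moles of double-stranded DNA are proportional to mass / length, so the
    mass ratio equals the molar ratio scaled by the ratio of vector lengths.
    """
    return selection_mass_ug * molar_ratio * product_bp / selection_bp


# Hypothetical example: both vectors are 5,000 bp and 0.01 ug of
# selection vector is used.
mass = product_dna_mass(0.01, 5000, 5000)
print(f"{mass:.2f} ug of product vector")
```

If the product vector is longer than the selection vector, the mass
required rises in proportion to its length.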

The population of cells that has been exposed to 
transforming conditions is then processed to identify the 
transformants. Only a small subpopulation of any culture which 
has been treated for cotransformation will exhibit the phenotype 
of the selection gene. The cells in the culture are screened for 
the phenotype. This can be accomplished by assaying the cells 
individually with a cell sorting device where the phenotype is 
one that will produce a signal, e.g. fluorescence upon cleavage 
of a fluorogenic substrate by an enzyme produced by the selection 
gene. Preferably, however, the phenotype enables only 
transformants to grow or survive in specialized growth media as 
is further discussed above. 

Selected transformants then will be screened for
ligation of the product gene into their chromosomes or for
expression of the product itself. The former can be accomplished
using Southern blot analysis, the latter by standard
immunological or enzymatic assays.

Once the transformants have been identified, steps are
taken to amplify expression of the product gene by further
cloning in the presence of a selection agent such as MTX. See
U.S. Patent 4,399,216.

Cotransformants which may be produced in accordance with
the processes described herein are suitable for in vivo
transfections of higher organisms in accordance with known
techniques. Primary explants or stable cell lines from a
potential host animal are cotransformed and inoculated into the
host or a substantially otherwise syngeneic host which is
genotypically deficient in the product protein.

The invention will be further understood with reference 
to the following illustrative embodiments, which are purely 
exemplary, and should not be taken as limiting the true scope
of the present invention, as described in the claims.

Unless otherwise noted, restriction endonucleases are
utilized under the conditions and in the manner recommended by
their commercial suppliers. Ligation reactions are carried out as
described by Maniatis et al., Molecular Cloning, A Laboratory
Manual (Cold Spring Harbor Laboratory 1982) at 245-6, the
disclosure of which is incorporated herein by reference, using
the buffer described at page 246 thereof and using a DNA
concentration of 1-100 µg/ml, at a temperature of 23°C for blunt
ended DNA and 16°C for "sticky ended" DNA. "Phosphatasing", as
described herein, refers to dephosphorylation of DNA, and is
carried out in the manner described by Maniatis et al., supra,
e.g. at page 133 et seq. "Kinasing" refers to phosphorylation of
DNA. Electrophoresis is done in 0.5-1.5% agarose gels containing
90 mM Tris-borate, 10 mM EDTA. "Nick-translating" refers to the
method for labeling DNA with 32P as described by Rigby et al., J.
Mol. Biol. 113:237 (1977). All radiolabeled DNA is labeled with
32P, whatever labeling technique was used.

By "rapid prep" is meant a rapid, small scale production 
of bacteriophage or plasmid DNA, e.g., as described by Maniatis 
et al., supra, at p. 365-373.

In accordance with another aspect of this invention,
there is provided a pharmaceutical preparation of AHF. The
pharmaceutical preparation of human AHF produced in accordance
with this invention may be prepared for parenteral administration
in accordance with procedures well known in the art.

The pharmaceutical preparation for human use comprises 
sterilized AHF recovered from transformed cells. In addition to 
the AHF polypeptide or sufficient portion thereof which produces 
AHF activity, there may be included one or more acceptable 
carriers therefor and optionally other therapeutic ingredients. 
The carrier(s) must be "acceptable" in the sense of being
compatible with other ingredients of the preparation and not 
deleterious to the recipient thereof. The preparation may 
conveniently be presented in unit dosage form and may be prepared 
by any of the methods well known in the art of pharmacy. 

The pharmaceutical preparation of this invention, 
suitable for parenteral administration, may conveniently comprise 
a sterile lyophilized preparation of the AHF polypeptide which 
may be reconstituted by addition of sterilized solution to 
produce solutions preferably isotonic with the blood of the 
recipient. The preparation may be presented in unit or 
multi-dose containers, for example sealed ampoules or vials. 

It should be understood that in addition to the sterile 
AHF or solution thereof, the pharmaceutical preparation of this 
invention may include one or more additional ingredients such as 
diluents, buffers, binders, surface active agents, thickeners, 
lubricants, preservatives (including antioxidants) and the like. 
Gelatine, lactose, starch, magnesium stearate, micronized silica
gel, cocoa butter, talc, vegetable and animal fats and oils,
vegetable rubber and polyalkylene glycol and other known
carriers for pharmaceuticals are all suitable for manufacturing
the pharmaceutical preparations of the present invention.
Preparations for parenteral use include an ampoule of a sterile
solution or suspension with water or other pharmaceutically
acceptable liquid as the carrier therefor, or an ampoule of
sterile solid AHF for dilution with a pharmaceutically acceptable
liquid.



The pharmaceutical preparation of the present invention 
is useful in the treatment of Hemophilia A. These preparations 
also provide an important tool in the in vivo as well as the in 
vitro study of the processes involved in clotting and in the 
study of the immunological and biological characteristics of the
AHF molecule.

Example 1. Protein Sequence Analysis

Porcine factor VIII:C was purified by Dr. David Fass
according to published procedures, Knutsen and Fass (1982); Fass
et al. (1982), supra. Amino acid sequence analysis is performed
on a bovine thrombin digest product of the 76,000 dalton protein
as described below.

Porcine AHF polypeptides bound to the anti-VIII:C
monoclonal antibody column can be successively eluted in two
steps (Fass et al. 1982). The two larger molecular weight
species, 166,000 and 130,000 daltons, can be eluted with EDTA.
The remaining polypeptide, having a molecular weight of about
76,000 daltons, can then be eluted with 50% ethylene glycol.

The 76,000 dalton protein is digested with thrombin after 
extensive dialysis in 50 mM Tris-HCl (pH 7.5), 0.15 M NaCl.
Bovine thrombin digests are performed at room temperature for 60 
minutes using 1 unit/ml of bovine thrombin, followed by the 
addition of another 1 unit/ml thrombin and incubation for an 
additional 60 minutes. The thrombin digestions are terminated by 
heating for 10 minutes at 90°C in 0.01% SDS. The major thrombin 
digest product of the 76,000 dalton protein is a polypeptide, 
69,000 daltons (69 Kd) . 

Less than 1 µg of the 69 Kd polypeptide species described
above is iodinated to serve as a radioactive marker after SDS gel
electrophoresis, in accordance with the procedure of U. K.
Laemmli, Nature 227:680 (1970), the disclosure of which is
incorporated herein by reference. The polypeptide, dissolved in
TAS buffer (50 mM Tris-acetate (pH 7.8), 0.1% SDS), is added to
100 µl of the same buffer containing 5 mCi of carrier-free
iodine-125. 50 µl of 2.5 mg/ml chloramine T (Baker) in TAS
buffer are added and the solution agitated for 1 minute. The
reaction is stopped by adding 50 µl of 2.5 mg/ml sodium
metabisulfite in TAS buffer followed by 1 minute agitation.
Labeled protein is separated from unincorporated I-125 by
chromatography using a small volume Sephadex G-25 M column
(PD-10, Pharmacia). The column is pre-equilibrated with several
column volumes of TAS buffer containing 2.5 ng/ml sodium iodide.
The void volume is collected and protein integrity analyzed by
SDS gel electrophoresis in accordance with the procedures of
Laemmli (1970), supra.

The radioactively labeled 69 Kd protein is added to its
unlabeled counterpart for subsequent monitoring. The protein is
then individually electrophoretically concentrated. The protein
solutions are adjusted to 0.1% SDS, 10 mM dithiothreitol, and
Coomassie brilliant blue (Serva) is added to make the solution a
very pale blue.

The solutions are then dialyzed briefly (1-2 hours) in
TAS buffer using Spectrapore dialysis tubing (made by Spectrum
Medical Industries, Inc., with molecular weight cut off at about
14,000). The dialyzed protein samples are then placed in an
electrophoretic concentrator, of the design suggested by
Hunkapiller et al., Meth. Enzymol., Enzyme Structures, Part I,
91:227 (1983), and electrophoretically concentrated in TAS buffer
for 24 hours at 50 volts.

The 69 Kd AHF polypeptide, concentrated by the above 
procedure, is electrophoresed through an SDS-polyacrylamide gel 
in accordance with Laemmli, supra. The purified protein, 
identified by autoradiography of the radioactively labeled tracer 
polypeptide, is excised from the gel and electroeluted and 
concentrated as described by Hunkapiller et al., supra. The
concentrated sample is suitable for direct amino acid sequence
analysis using the gas phase sequenator as described in Hewick et
al., J. Biol. Chem. 256:7990 (1981).

The amino terminal sequence extending from residue 2 through
residue 42 of the 69,000 dalton thrombin cleavage product is as
depicted in Figure 1. The first residue "X" was not
identifiable. The amino acid sequences of (A) the amino terminus
and (B) a 40 kd bovine thrombin digestion product from the 166 kd
AHF fragment noted by Fass et al., supra, are shown in Figure 2.
The amino acid sequence of the amino terminus of the 76 Kd
polypeptide is: X-Ile Ser Leu Pro Thr Phe Gln Pro Glu Glu Asp Lys
Met Asp Tyr Asp Asp Ile Phe.

Example 2 

Chemical Synthesis of Oligonucleotide Probes 
for a Porcine AHF Gene 

(a) Pentapeptide Probe Pool 

The partial amino acid sequence of a fragment of porcine
AHF having been determined allows porcine AHF oligonucleotide
probes to be designed and synthesized. From the genetic code
(Table 1) it is possible to predict the gene sequence that codes
for this sequence of amino acids. Because the genetic code is
degenerate, there is more than one possible DNA coding sequence
for each amino acid sequence. Accordingly, a plurality or pool
of complementary oligonucleotide probe sequences is provided for
a region of the AHF molecule which requires only a reasonable
number of oligonucleotides to ensure the correct DNA sequence.
Such regions are selected by searching for tracts of five to
eight contiguous amino acids which have the lowest degeneracy.
After the region is selected, a pool of oligonucleotides is
synthesized, which would include all possible DNA sequences which
could code for the five to eight amino acids in the selected
region.
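
The search for a tract of lowest degeneracy can be sketched in
Python as follows. This is only an illustration, not the procedure
actually used: the function names are hypothetical, the codon counts
are those of the standard genetic code, and the example peptide is
the 76 Kd amino terminal sequence quoted in Example 1 with the
unidentified first residue omitted.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "Ala": 4, "Arg": 6, "Asn": 2, "Asp": 2, "Cys": 2, "Gln": 2, "Glu": 2,
    "Gly": 4, "His": 2, "Ile": 3, "Leu": 6, "Lys": 2, "Met": 1, "Phe": 2,
    "Pro": 4, "Ser": 6, "Thr": 4, "Trp": 1, "Tyr": 2, "Val": 4,
}


def window_degeneracy(residues):
    """Number of distinct coding sequences for a run of residues."""
    return prod(CODON_COUNT[r] for r in residues)


def best_window(peptide, length):
    """Return (start index, residues, degeneracy) of the least degenerate tract.

    Ties are resolved in favour of the tract nearest the amino terminus.
    """
    candidates = [
        (window_degeneracy(peptide[i:i + length]), i)
        for i in range(len(peptide) - length + 1)
    ]
    degeneracy, start = min(candidates)
    return start, peptide[start:start + length], degeneracy


# 76 Kd amino terminus from Example 1, without the unidentified residue "X".
peptide_76kd = [
    "Ile", "Ser", "Leu", "Pro", "Thr", "Phe", "Gln", "Pro", "Glu", "Glu",
    "Asp", "Lys", "Met", "Asp", "Tyr", "Asp", "Asp", "Ile", "Phe",
]

start, tract, degeneracy = best_window(peptide_76kd, 5)
print(start, tract, degeneracy)
```

The same function applied to Trp-Asp-Tyr-Gly-Met gives a degeneracy
of 16, which agrees with the pentapeptide pool described below.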

In the 69,000 dalton thrombin-cleavage fragment of
porcine AHF there is a pentapeptide sequence running from the
18th through the 22nd amino acid from the amino terminus, which
could be coded for by up to 16 different DNA sequences, each
having five codons, for a length of 15 nucleotides.

     18        19        20        21        22
     Trp   -   Asp   -   Tyr   -   Gly   -   Met

     UGG       GAU       UAU       GGU       AUG
                or        or        or
               GAC       UAC       GGC
                                    or
                                   GGA
                                    or
                                   GGG


The probes are made by synthesizing a limited number of 
mixtures of oligonucleotides with two to eight or more 
oligonucleotides per mixture. These mixtures are referred to as 
pools. Enough pools are made to encompass all possible coding 
sequences. 
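
A minimal Python sketch of how such pools might be enumerated for
the Trp-Asp-Tyr-Gly-Met pentapeptide is given below. The codon
choices are those of the standard genetic code for these residues;
the function names and the pool size of eight are illustrative
assumptions only and do not describe the pools actually synthesized.

```python
from itertools import product

# mRNA codons for each residue of the pentapeptide (residues 18 to 22).
CODON_CHOICES = [
    ["UGG"],                       # Trp
    ["GAU", "GAC"],                # Asp
    ["UAU", "UAC"],                # Tyr
    ["GGU", "GGC", "GGA", "GGG"],  # Gly
    ["AUG"],                       # Met
]

# RNA base -> complementary DNA base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def coding_sequences(codon_choices):
    """All mRNA sequences that could encode the peptide."""
    return ["".join(codons) for codons in product(*codon_choices)]


def probe_for(mrna):
    """DNA oligonucleotide complementary to an mRNA, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna))


def split_into_pools(sequences, pool_size):
    """Divide the sequences into consecutive pools of at most pool_size."""
    return [sequences[i:i + pool_size] for i in range(0, len(sequences), pool_size)]


mrnas = coding_sequences(CODON_CHOICES)
probes = [probe_for(m) for m in mrnas]
pools = split_into_pools(probes, 8)  # assumed pool size

print(len(mrnas), "coding sequences in", len(pools), "pools")
```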

These oligonucleotides can be synthesized manually, e.g.,
by the phospho-tri-ester method, as disclosed, for example, in R.
L. Letsinger et al., J. Am. Chem. Soc. 98:3655 (1967), the
disclosure of which is incorporated by reference. Other methods
are well known in the art. See also Matteucci and Caruthers, J.
Am. Chem. Soc. 103:3185 (1981), the disclosure of which is
incorporated by reference.

Preferably, however, the synthetic oligonucleotide probes 
for the desired polypeptide sequences are prepared by identical 
chemistry with the assistance of the completely automatic Applied 
Biosystems DNA synthesizer, Model 380A, as indicated above.

The oligonucleotides thus prepared can then be purified
on a reverse-phase HPLC column, as described by H. Fritz et al.,
Biochemistry 17:1257 (1978), the disclosure of which is
incorporated herein by reference. After detritylation with 80%
HOAc the resulting oligonucleotide is normally pure and can be
used directly as a probe. If there are any contaminants, the
synthetic DNA can be further purified on the same HPLC column,
preferably using a slightly different gradient system.

Oligonucleotides are labelled, e.g. by using [γ-32P]ATP
and T4 polynucleotide kinase, and their sequence checked either
by two-dimensional homochromatography as described by Sanger et
al., PNAS U.S.A. 70:1209 (1973) or by the Maxam-Gilbert method,
Meth. Enzymology 65:499 (1977), the disclosures of both of which
are incorporated herein by reference.

(b) Forty Five Nucleotide Probes 
A unique aspect of the present invention has been the use 
of oligonucleotide probes to screen a genomic DNA library for the 
AHF gene or fragments thereof. While oligonucleotides have been 
used for screening of cDNA libraries, see, e.g. M. Jaye et al.,
Nucleic Acids Research 11:2325 (1982), the disclosure of which is
incorporated herein by reference, genomic libraries have
previously been screened successfully only with cDNA probes, i.e.
probes which were generated only after the tissue source of the
mRNA for a described protein had been identified and utilized to
generate a cDNA clone which precisely matched the DNA sequence of 
the gene sought by the genomic search. 

In the present case, it has been shown possible to use 
oligonucleotides to identify gene segments in a genomic library 
which code for the amino acid sequence of the proteins of 
interest, and identification of such gene segments provides an 
exact probe for use in mRNA, cDNA or further genomic screening
techniques. 

Preferably, as here, oligonucleotides corresponding to at 
least two segments of the amino acid sequence of the protein of 
interest are utilized. Preferably at least one of the 
oligonucleotide probes is used in the form of one or more pools 
of oligonucleotides which in the aggregate include every possible 
DNA sequence which could code for the amino acid sequences 
selected. Preferably one relatively short probe, e.g. 11 to 25
nucleotides, preferably 15 to 20 nucleotides, is utilized in
conjunction with a relatively long probe, e.g. 30 to 200
nucleotides, preferably 40 to 50 nucleotides. The second probe
can be used for confirmation, and is not always necessary for 
identification of the DNA segment. Preferably at least one of 
the probes, and more preferably the longer of the probes is 
designed in accordance with the Rules 1 to 4 described below. 

Rule 1. Codon Preference. In the absence of other
considerations, the nucleotide sequence was chosen which matched
prevailing or similar sequences in similar mammalian genes. See
Mechanisms of Ageing and Development 18 (1982).

Rule 2. Advantageous G:T Pairing. The nucleotide G, in
addition to bonding to its complement C, can also form weak bonds
with the nucleotide T. See K. L. Agarwal et al., J. Biol. Chem.
256:1023 (1981). Thus, faced with a choice of G or A for the
third position of an ambiguous codon, it is preferable to choose
G, since the hybridization would still be stable even if the
actual nucleotide in that position is a T rather than a C. If an
A were chosen incorrectly, the corresponding A:C incompatibility
could be enough to destroy the ability of the probe to hybridize
with the genomic DNA.

Rule 3. Avoidance of 5' CG Sequences. When selecting from
among the possible ambiguities, select those nucleotides which
will not contain the 5' CG sequence, either intra codon or inter
codon.

Rule 4. Mismatch Position. In choosing the codon
sequences to use, consideration was given to the postulation that
mismatches near the ends of the molecule do not affect the
stability of the hybridization as adversely as do mismatches near
the center of the molecule. Thus, for example, where substantial
doubt occurred concerning the sequence of a particular codon, and
that codon was close to the center of the probe, the tendency was
to test a pool of possible nucleotide sequences for that codon,
whereas codon positions near the ends of the probe were more
likely to be subject to determinations on the basis of codon
preference.
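
Rules 2 and 3 can be read as a small decision procedure for a single
ambiguous position. The Python sketch below is one hypothetical way
of expressing them; it assumes that Rule 3 refers to avoiding the
dinucleotide 5'-CG-3', and the function name and its return
convention are illustrative only. Positions that the two rules leave
undecided would still fall to codon preference (Rule 1) or to a pool
(Rule 4).

```python
def choose_base(options, prev_base=None, next_base=None):
    """Pick one nucleotide for an ambiguous position, or None if undecided.

    options   -- candidate bases for this position, e.g. "AG"
    prev_base -- base immediately 5' of the position, if already fixed
    next_base -- base immediately 3' of the position, if already fixed
    """
    candidates = set(options)

    # Rule 3 (as assumed here): discard bases that would create 5'-CG-3'
    # with either neighbour, unless that would leave no candidate at all.
    no_cg = {
        base for base in candidates
        if not (prev_base == "C" and base == "G")
        and not (base == "C" and next_base == "G")
    }
    if no_cg:
        candidates = no_cg

    if len(candidates) == 1:
        return next(iter(candidates))

    # Rule 2: faced with a choice of G or A, prefer G, because G can
    # still pair weakly with T.
    if candidates == {"A", "G"}:
        return "G"

    # Undecided: left to codon preference or to a pool of alternatives.
    return None


print(choose_base("AG"))                 # G, by Rule 2
print(choose_base("AG", prev_base="C"))  # A, because G would create CG
```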

The chosen sequence for the 45-mer probes, as well as the
amino acid sequences, the possible DNA sequences, and the "Actual
Probe Sequence", i.e. the complement of the actual coding strand
determined for the AHF exon, are shown in Figure 3.

Thus, out of 13 nucleotide positions involving choices,
three were covered by using pools containing both possible
nucleotide alternatives, five were predicted correctly, one was
predicted in a way to maintain neutrality though incorrect, and
four others were in error.

Despite the approximately 11% mismatch (5/45), the pool of
45-mer oligonucleotides is adequate to strongly identify a
porcine AHF gene fragment, as described below.

Example 3. Screening of Porcine Genomic Library

A porcine genomic library is constructed using the
bacteriophage vector Lambda J1. Lambda J1 is derived from L47.1
(Loenen et al., Gene 20:249 (1980)) by replacement of the 1.37 kb
and 2.83 kb Eco RI-Bam HI fragments with a 95 bp Eco RI-Hind
III-Xba I-Bgl II-Bam HI polylinker. The 6.6 kb Bam HI fragment
is then present as a direct repeat in reverse orientation
relative to L47.1. The cloning capacity for Bam HI fragments is
8.6 - 23.8 kb. Bam HI cleaved porcine DNA (prepared as described
by Piccini et al., Cell 30:205 (1982)) is phenol extracted,
ethanol precipitated and concentrated by centrifugation in a
Microfuge. 0.67 µg of Bam HI porcine DNA is ligated to 2 µg of
Lambda J1 Bam HI "arms", prepared as described in Maniatis et
al., supra, pp 275-279, in a volume of 10 µl ligation buffer with
10 units T4 DNA ligase (Maniatis et al., supra, p 474). The
ligated DNA is packaged and plated as described in Maniatis et
al., supra, p 291.

Approximately 4 x 10^5 pfu are plated on E. coli strain
C600 on 15 cm plastic petri plates containing NZCYM agarose, at
8,000 pfu/plate. These recombinant phage are screened by the
method of Woo (1979) using the 45-mer probe described above,
radioactively labeled with 32P as described above, as a probe.
Filters are then hybridized in 5 x SSC, 5 x Denhardt's, 0.1% SDS,
and 5 x 10^6 cpm/ml probe at 45° C for 16 hours, washed in 5 x
SSC, 0.1% SDS at 50° C and subjected to autoradiography using
intensifying screens (DuPont Lightning-Plus). Autoradiography
reveals numerous phage which hybridized, to varying degrees, with
the 45-mer. The filters are then denatured in 0.5M NaOH,
neutralized in 1.0M Tris pH 7.5, 1.5M NaCl and hybridized to the
15-mer as described for the 45-mer except the hybridization and
washing temperature is 37° C. One phage which hybridized to both
probes is picked from the original plate and 100 pfu plated and
the plaques screened as described above using the 15-mer as
probe.
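
The plating figures above imply the number of plates needed for the
primary screen. The short Python sketch below simply performs that
arithmetic; the function name is hypothetical.

```python
import math


def plates_needed(total_pfu, pfu_per_plate):
    """Smallest whole number of plates that holds total_pfu at the given density."""
    return math.ceil(total_pfu / pfu_per_plate)


# Primary screen of the porcine library: about 4 x 10^5 pfu at 8,000 pfu/plate.
print(plates_needed(400_000, 8_000))
```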

A positive phage, named PB34, is picked as a plug and
used to make a plate lysate as described in Maniatis et al.,
supra, pp 65-66. A small-scale isolation of PB34 DNA is achieved
using the procedure described in Maniatis et al., supra, pp
371-372. 10 µl of this DNA is cut with the restriction enzyme
Hae III and then phosphatased using calf alkaline phosphatase
(Boehringer-Mannheim). After phenol extraction, 20 ng of Sma I
cut M13mp8 DNA is added, the solution is made 0.2M NaCl and
nucleic acid precipitated by the addition of 2 volumes ethanol.
Precipitated DNA is pelleted by centrifugation and redissolved in
2 µl of a ligation mixture and the DNA is ligated for 30 minutes
at 23° C, diluted to 50 µl with ligase buffer and ligated an
additional 3 hours. 5 µl of this reaction is used to transform
E. coli strain JM101/TG1.

Cells are made competent for transformation by growing to
an O.D.600 of 0.5 at 37° C in 50 ml SOBM media (SOBM is 2%
tryptone, 0.5% yeast extract, 0.1M NaCl, 0.11g KOH per liter,
20mM MgSO4). Cells are pelleted by centrifugation at 2500 rpm
for 10 minutes at 4° C. The cells are resuspended in 3.5 ml
100mM RbCl, 45mM MnCl2, 50mM CaCl2, 10mM potassium MES pH 6.4
(MES = morpholinoethane sulfonic acid). 200 µl of competent cells
are transformed with the DNA contained in 5 µl of the ligation
reaction at 0° C for 30 minutes. The cells are then heat-shocked
at 42° C for 90 seconds after which 4 ml of 0.8% agarose/SOBM
containing 100 µl stationary JM101/TG1 cells are added and plated
on 10 cm SOBM agar petri plates.

A subclone, containing a Hae III fragment from PB34,
hybridizing to the 15-mer, is identified by screening using the
procedure of Benton and Davis, supra. This clone is isolated and
prepared for use as a template. The DNA sequence of the Hae III
fragment present in this clone, designated as 34-H1, is shown in
Figure 5. M13 template DNA is prepared by growing 1.5 ml of
infected cells 5 hours at 37° C. Cells are pelleted by
centrifugation for 10 minutes in a Beckman microfuge. 1.0 ml of
supernatant (containing virus) is removed and 200 µl of 20%
polyethylene glycol, 2.5 M NaCl is added. This sample is then
incubated at room temperature for 15 minutes followed by
centrifugation for 5 minutes in a Beckman microfuge. The pellet
is dissolved in 100 µl TE, 7.5 µl of 4M NaCH3COO pH 4.5 is added
and the sample extracted twice with a 1:1 mixture of
phenol-chloroform and once with chloroform. The single-stranded
phage DNA is then precipitated by the addition of 2 volumes of
ethanol. Precipitated DNA is pelleted by centrifugation in a
Beckman microfuge and dissolved in 30 µl 1 mM Tris pH 8.0, 0.1 mM
EDTA. The DNA sequencing is performed by the dideoxy chain
termination technique, see, e.g. Sanger et al., PNAS U.S.A.
74:5463 (1977), utilizing the 15-mer as a primer. The sequence
observed, which is included in the sequences represented in Figs.
4 and 5, confirmed the subclone as containing a porcine AHF exon
as it includes the identical 14 amino acids encompassed by the
region phenylalanine 2 to glutamine 15 in the amino terminal
sequence of the 69K fragment represented in Figure 1. Further
confirmation came from sequencing from the 5' end of the Hae III
insert in the 34-H1 vector, by priming with the "Universal
primer" (Bethesda Research Labs) at a point adjacent the
polylinker in that vector. Also, the insert from this clone,
named 34-H1, was recloned into M13mp9 by restricting the DNA with
Eco RI and Hind III, phosphatasing with calf alkaline
phosphatase, and ligating to Eco RI, Hind III cleaved M13mp9 DNA.
This clone, which contains an inversion of the Hae III segment
relative to the universal primer, was also sequenced as described
above. The resulting sequence data for all of insert 34-H1 is
shown in Figure 5. This sequence confirms that this subclone
contains an exon of the porcine AHF gene that could encode, from
nucleotides 169 to 267, at least the thirty amino acids from
phenylalanine 2 through arginine 31 of the 69K fragment (Fig. 1).

It appears likely that arginine31 borders an intron 
because termination codons can be found downstream from 
nucleotide 267 (Fig. 5) in all three reading frames, and a 
sequence similar to the consensus 5' splice site sequence is also 
found in that region between nucleotides 266-267. Further, the 
amino acid sequence which would be encoded by the downstream DNA 
differs completely from that observed in the 69 Kd fragment of 
porcine AHF. 
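
The reading-frame argument can be illustrated with a short Python
sketch that looks for termination codons in each of the three
forward frames of a DNA sequence. The function names and the test
sequence are hypothetical; the sequence is not the porcine sequence
of Figure 5.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_positions(dna, frame):
    """0-based start positions of stop codons read in the given frame (0, 1 or 2)."""
    return [
        i for i in range(frame, len(dna) - 2, 3)
        if dna[i:i + 3] in STOP_CODONS
    ]


def closed_in_all_frames(dna):
    """True if every forward reading frame contains at least one stop codon."""
    return all(stop_positions(dna, frame) for frame in range(3))


# Hypothetical downstream sequence, used only to exercise the functions.
downstream = "GTAAGTTGACCTAGA"
for frame in range(3):
    print(frame, stop_positions(downstream, frame))
print(closed_in_all_frames(downstream))
```

A region that is closed in all three frames cannot continue the
coding sequence, which is consistent with an intron beginning at
that point.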

PB34 DNA was cut with Bam HI and electrophoresed through
an agarose gel and the bands visualized by ultraviolet light
after staining the gel in 5 µg/ml ethidium bromide. Three
inserts of approximately 6.6 kb, 6.0 kb, and 1.8 kb were
observed. The DNA in the gel was transferred to nitrocellulose
as described in Maniatis et al., supra, pp 383-386.
Hybridization of the filter to the 15-mer and autoradiography
were performed as described above. Autoradiography revealed that
the 6.0 kb band contained an AHF gene fragment which hybridized
to the 15-mer probe.

Thus, for the first time, a section of the gene coding 
for porcine AHF had been isolated and identified. The 
bacteriophage lambda recombinant clone PB34 is on deposit at the
American Type Culture Collection under Accession Number ATCC
40087.

Example 4. Location of Human AHF Gene

The human genomic library described by Maniatis et al.,
Cell 15:687 (1978) is screened for the human AHF gene by
infecting E. coli strain LE392 (publicly available) with 6 x 10^5
pfu and plating on 15 cm NZCYM agar plates at a density of 20,000
pfu/plate. These phage are screened using the procedure of
Benton and Davis, supra, with the 6.0 kb porcine AHF fragment
described in Example 3, labeled with 32P by nick translation, as
the probe. A phage exhibiting a strong hybridization signal is
picked and plated at about 100 pfu/10 cm plate and screened in
duplicate as described above using the radioactively labeled
45-mer as one probe and the 6.0 kb Bam HI fragment of PB34 as the
other. A phage, named HH25, which hybridizes to both probes is
identified, a plate stock made and rapid prep DNA prepared as
described above. The phage DNA is cut with Sau3A I, phosphatased
with calf alkaline phosphatase, phenol extracted, and
co-precipitated with 20 ng of Bam HI cut M13mp8 DNA.
Precipitated DNA is pelleted by centrifugation and redissolved in
2 µl ligase buffer containing T4 DNA ligase. Ligation is
performed for 2 minutes at 16° C, diluted to 50 µl in ligase
buffer containing T4 DNA ligase and incubated an additional 3
hours at 16° C. 5 µl of this reaction mixture is used to
transform E. coli strain JM101/TG1 as described in Example 3
above.

The plaques are screened using the Benton and Davis
procedure and probing with radioactively labeled 15-mer. A phage
plaque, named 25-S1, exhibiting hybridization is isolated and
single-stranded phage DNA prepared for use as a DNA sequencing
template as described above. Sequencing is performed using the
dideoxy chain termination technique described by Sanger et al.,
supra, utilizing the 15-mer as primer, and gives the information
strand DNA sequence shown in Figure 6. 84 nucleotides sequenced
in this manner demonstrated 85% homology with the homologous
region of porcine AHF. There is also only one amino acid
difference between the porcine AHF 69K region 2-16 as shown in
Figure 1 and the corresponding region which was deduced from the
human nucleotide sequence of Figure 6. This high degree of
homology shows that the DNA of the recombinant phage HH25
emanates from the AHF gene.
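
A homology figure of this kind is a percent identity over aligned
positions. The Python sketch below shows one simple way to compute
it for two pre-aligned sequences of equal length; the function name
and the two short sequences are hypothetical and are not taken from
Figures 5 or 6.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical aligned fragments differing at one of ten positions.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```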

Thus, for the first time, an exon for the human AHF gene 
has been isolated and identified. A bacteriophage lambda 
recombinant HH25 is on deposit at the American Type Culture
Collection under the Accession Number ATCC 40086. 

Example 5. Identifying Cells Actively Transcribing AHF

The porcine and human AHF exons described above are 
useful for a variety of functions, one of which is as a screening 
agent which permits identification of the tissue which is the 
site of synthesis of AHF in vivo. A number of screening methods
are available, based on the use of the exon as an exact
complement to the mRNA which is produced during the course of
natural expression of AHF. In the screening procedures, tissue
from various parts of the body is treated to liberate the mRNA
contained therein, which is then hybridized to a DNA segment
containing the exon for AHF, and if a molecule of mRNA does 
hybridize to that exon, the tissue which is the source of that 
mRNA is the source of AHF. 

1. Screening Procedure

Porcine or human tissue from various organs, including
kidneys, liver, pancreas, spleen, bone marrow, lymph nodes, etc.,
is prepared by guanidine hydrochloride extraction as described by
Cox, Meth. Enzymol. 12B:120 (1968), the disclosure of which is
incorporated herein by reference, with some modifications.
Briefly, tissue is explanted into 8 M guanidine hydrochloride (or
4 M guanidine isothiocyanate as proposed by Chirgwin et al.,
Biochemistry 18:5294 (1979), the disclosure of which is
incorporated herein by reference, see also Maniatis et al.,
supra, at 189 et seq.), 50 mM Tris (pH 7.5), 10 mM EDTA and
homogenized in an Omnimixer (Sorvall) at top speed for 1 minute.
The homogenate is clarified at 5000 rpm for 5 minutes in a Sorvall
HB-4 rotor and the RNA is precipitated by the addition of 0.5
volumes of ethanol. The RNA is dissolved and precipitated 3 more
times from 6 M guanidine hydrochloride before being dissolved in
H2O.

Messenger RNA from this pool is enriched by 
chromatography on oligo(dT) cellulose (Collaborative Research). 

This mRNA is then subjected to electrophoresis through an 
agarose gel containing formaldehyde, as described by Maniatis et 
al., supra, at 202-3. The mRNA in the gel is then transferred to 
a nitrocellulose filter (Maniatis et al., supra, at 203-4). 

The thus obtained mRNA is hybridized with the 
radiolabeled porcine or human exon DNA obtained as described 
above, and the existence of hybrids is detected by autoradiography. 
A radioactive signal indicates that the tissue source of the mRNA 
is a source of synthesis of AHF in the body. 

Alternatively, mRNA can be screened using the S1 
protection screening method. 

S1 nuclease is an enzyme which hydrolyzes single stranded 
DNA, but does not hydrolyze base paired nucleotides, such as 
hybridized mRNA/DNA. Thus the existence of a radioactive band 
after acrylamide gel electrophoresis and autoradiography shows 
that the single stranded DNA corresponding to the AHF exon has 
been protected by the complementary mRNA, i.e. the mRNA for AHF. 
Thus the tissue which was the source of that mRNA is a site of 
synthesis of AHF in vivo. That tissue can be used as a source 
for AHF mRNA. This method of screening can be somewhat more 
sensitive than the screening method described above. 

A probe consisting of single stranded radioactively 
labeled DNA complementary to the mRNA is synthesized using the 
universal primer of M13 to prime DNA synthesis of the porcine 
genomic subclone 34-H1. The reaction is performed in a 100 ul 
solution of 50 mM Tris pH 7.4, 5 mM MgCl2, 1 mM 
2-mercaptoethanol, 50 mM NaCl, 40 uM dGTP, dTTP, dCTP, 60 uCi of 
32P-dATP (400 Ci/mmole), 10 ng universal primer, 200-400 ng of 
34-53 template DNA, and the Klenow fragment of DNA polymerase I 
(E. coli). The reaction is incubated at 23°C for 60 minutes and 
then at 70°C for 10 minutes; 50 units of PstI are added and the 
reaction is incubated an additional 60 minutes. 

The reaction is terminated by phenol/chloroform 
extraction, NaCl is added to 0.2 M, and the DNA is then precipitated 
with two volumes of 100% ethanol. The precipitated DNA is pelleted by 
centrifugation, redissolved in 20% sucrose, 50 mM NaOH, 0.1% 
cresol green and then electrophoresed through 2% agarose in 50 mM 
NaOH, 10 mM EDTA. The resulting single stranded fragment is 
localized by autoradiography, the band excised and the DNA isolated 
by electroelution in dialysis tubing. 

Sample mRNA is prepared from liver, spleen, etc. tissue 
by the guanidine hydrochloride method described above. 

The probe is then hybridized to sample mRNA (obtained 
from the oligo(dT) chromatography enriching step) in 50% 
formamide, 0.4 M NaCl, 40 mM PIPES [piperazine-N,N'- 
bis(2-ethanesulfonic acid)] pH 6.5, 1 mM EDTA, 5-50 ug mRNA, 2 ug 
labeled DNA in a volume of 15 ul. The hybridization is 
terminated by the addition of 200 ul cold S1 nuclease buffer 
(0.25 M NaCl, 0.3 M NaCH3COO [pH 4.5], 1 mM ZnSO4, 5% glycerol) 
and 1000 units S1 nuclease. The reaction is incubated at 45°C 
for 30 minutes. Samples are phenol extracted, ethanol 
precipitated with 10 ug yeast tRNA carrier and subjected to 
analysis by electrophoresis through 5% polyacrylamide sequencing 
gels as described in Maxam-Gilbert, PNAS USA, 74:560 (1977). 

EXAMPLE 6 
Use of AHF Exon DNA to Obtain AHF mRNA From Tissue 

Once a tissue which is a source of mRNA is identified, 
AHF mRNA from that tissue is extracted and used to construct a 
cDNA library. The cDNA library is then utilized for identifying 
and constructing a full length cDNA clone which encodes the amino 
acid sequence for AHF, without the introns contained in the 
genomic clone, as described below. Thereafter, the cDNA which 
encodes the AHF protein is inserted into an appropriate 
expression vector, in an appropriate host, for expression of AHF. 

Since human AHF is in great demand for treatment of 
hemophilia and other uses, the preparation of human AHF cDNA is 
described. 

1. Obtaining mRNA for Human AHF 

mRNA from the human tissue responsible for AHF synthesis 
is prepared by the guanidine hydrochloride extraction method as 
described by Cox and modified by Chirgwin, et al. as disclosed 
above. 

Further fractionation of mRNA obtained from the oligo 
(dT) cellulose chromatography column is obtained by sedimenting 
on 5-20% sucrose gradients containing 10 mM Tris-HCl (pH 7.4), 
1 mM EDTA and 0.2% SDS by centrifugation for 24 hours at 22,000 
rpm in a Beckman SW28 rotor. Fractions (1.0 ml) are collected, 
sodium acetate is added to 0.2 M, and the fractions are ethanol 
precipitated twice before dissolving in water. The size 
distribution of the fractionated RNA is determined by 
electrophoresis through 1.4% agarose gels containing 2.2 M 
formaldehyde. 

mRNA sedimenting with an S (Svedberg) value greater than 
28 is pooled for the synthesis of double stranded cDNA. 10 ug of 
this RNA is denatured at room temperature in 10 ul of 10 mM 
methylmercury hydroxide. 140 mM 2-mercaptoethanol is added to 
inactivate the methylmercury hydroxide. The RNA is then diluted to 
50 ul containing 140 mM KCl, 100 mM Tris-HCl (pH 8.3 at 42°C), 1 
mM of each deoxynucleotide triphosphate, 200 ug/ml 
oligo(dT)12-18, 10 mM MgCl2, and 0.1 uCi 32P-dCTP/ml. These 
reactions are performed at 42°C for 1 hour after the addition of 
3 ul of 17 units/ul AMV reverse transcriptase (Life Sciences). 
The reaction is terminated by the addition of 0.25 M EDTA (pH 
8.0) to 20 mM. The resulting mixture is extracted once with an 
equal volume of phenol/chloroform (1:1) followed by one 
chloroform extraction. The sample is then chromatographed on a 5 
ml Sepharose CL-4B column (Pharmacia) equilibrated in 10 mM 
Tris-HCl (pH 8.0), 100 mM NaCl, 1 mM EDTA. The void volume is 
collected and the nucleic acids (including any RNA/cDNA hybrids) 
are precipitated by the addition of 2.5 volumes of ethanol. 
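
Several steps in this procedure bring a reaction "to" a stated final concentration by adding a concentrated stock (for example, 0.25 M EDTA added to 20 mM). The arithmetic can be sketched as follows, assuming the reaction initially contains none of the added component; the 53 ul starting volume is an illustrative assumption (50 ul plus 3 ul of enzyme) and the function name is hypothetical.

```python
def stock_volume_to_add(start_volume_ul: float,
                        stock_conc_mm: float,
                        final_conc_mm: float) -> float:
    """Volume of stock (ul) to add so the mixture reaches final_conc_mm.

    Solves stock * v / (start_volume + v) = final for v, assuming the
    starting solution contains none of the added component.
    """
    if final_conc_mm >= stock_conc_mm:
        raise ValueError("final concentration must be below the stock concentration")
    return final_conc_mm * start_volume_ul / (stock_conc_mm - final_conc_mm)


# Assumed example: a 53 ul reaction brought to 20 mM EDTA
# from a 0.25 M (250 mM) stock.
edta_ul = stock_volume_to_add(53.0, 250.0, 20.0)
print(f"add {edta_ul:.1f} ul of 0.25 M EDTA")
```

The formula accounts for the increase in total volume caused by the addition itself, which matters when the stock is only about ten-fold more concentrated than the target.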

Preferably in conjunction with the above procedures an 
AHF exon oligonucleotide segment is also used in place of oligo 
dT to prime the reverse transcription, as described in Ullrich 
et al., Nature, 303:821 (1983). 

The RNA-cDNA hybrids are dissolved in 35 ul of deionized 
H2O, made 100 mM potassium cacodylate (pH 6.8), 100 uM dCTP, 1 mM 
2-mercaptoethanol, 1 mM cobalt chloride and enzymatically 
"tailed" by the addition of 10 units of deoxycytidyl terminal 
transferase (P-L Biochemicals) and incubating the reaction for 30 
seconds at 37°C. The reaction is terminated by adding 0.25 M 
EDTA to 10 mM. Tris-HCl (pH 8.0) is added to 300 mM and the 
sample extracted once with an equal volume of phenol-chloroform 
(1:1) and then with an equal volume of chloroform. Nucleic acids 
are precipitated from this product by the addition of 2.5 volumes 
of ethanol. 

The dC tailed hybrid molecules are then annealed with 170 
ug/ml oligo(dG)14-18 cellulose in 10 mM KCl, 1 mM EDTA for 10 
minutes at 43°C and then an additional 10 minutes at 23°C. This 
reaction product is then diluted to 100 ul containing 100 mM 
ammonium sulfate, 1 mM 2-mercaptoethanol, 100 mM MgCl2, 100 ug/ml 
bovine serum albumin (Sigma, Cohn fraction V), and 100 uM 
nicotinamide adenine dinucleotide. Second strand cDNA synthesis 
is initiated by the addition of 1 unit RNase H (P-L 
Biochemicals), 1 unit of E. coli DNA ligase, and 10 units of DNA 
polymerase I and incubated at 16°C for 12 hours. 

The sample is then chromatographed over Sepharose CL-4B 
as described above. Double stranded DNA is ethanol precipitated 
and dC tailed as described for the RNA-cDNA hybrid tailing. 

2. Screening for Human AHF DNA 

dC tailed double stranded cDNA obtained as described 
above is annealed with an equimolar amount of dG tailed pBR322 
(New England Nuclear) in 10 mM Tris-HCl (pH 8.0), 1 mM EDTA, 100 
mM NaCl at 37°C for 2 hours. The annealed chimeric molecules are 
then frozen at -20°C until use in bacterial transformation. 

Bacterial transformation is done using the MC1061 strain 
of E. coli (source). Cells (50 ml) are grown to an optical 
density of 0.25 at 600 nm. Cells are concentrated by 
centrifugation at 2,500 rpm for 12 minutes, washed in 10 ml of 
sterile 100 mM CaCl2, and again pelleted by centrifugation as 
described above. Cells are resuspended in 2 ml of sterile 100 mM 
CaCl2 and kept at 4°C for 12 hours. The annealed chimeric 
molecules are incubated at a ratio of 5 ng of double stranded 
cDNA per 200 ul competent cells at 4°C for 30 minutes. The 
bacteria are then subjected to a two minute heat pulse at 42°C. 
1.0 ml of L-broth is then added and the cells incubated for 1 
hour at 37°C. Cells are then plated onto LB-agar plates 
containing 5 ug/ml tetracycline. 

Human AHF clones are identified using the colony 
hybridization procedure of Grunstein and Hogness, PNAS U.S.A. 
72:3961 (1975), the disclosure of which is incorporated herein by 
reference. The cDNA library is plated onto a nitrocellulose 
filter (Schleicher and Schuell) overlaying L-broth, 5 ug/ml 
tetracycline agar plates. Colonies are grown overnight at 37°C 
and then the filter is placed on sterile Whatman 3 M paper. A 
pre-moistened nitrocellulose filter is then pressed against the 
master filter and the filters keyed using an 18 gauge needle. 
The replica filter is then grown on LB-tetracycline plates at 
37°C until the colonies reach a diameter of 1-2 mm. The 
filters are then transferred to LB plates containing 150 ug/ml 
chloramphenicol and incubated at 37°C for 16-24 hours. 

Filters are then removed and placed onto Whatman 3 M 
paper saturated with 0.5 M NaOH for 5 minutes at room 
temperature. Filters are then neutralized by placing onto 
Whatman 3 M saturated with 1 M Tris-HCl (pH 7.5), 1.5 M NaCl, and 
then Whatman 3 M saturated with 2x standard saline citrate (SSC). 
SSC (1x) is 0.15 M NaCl, 0.015 M sodium citrate. 
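
Because the procedure uses SSC at several strengths (2x here, 7x in the hybridization and wash steps), the corresponding molarities follow by simple scaling of the 1x definition. The sketch below illustrates the scaling; the 20x stock in the usage lines is an assumption for illustration and is not stated in the text, and the function names are hypothetical.

```python
# 1x SSC as defined in the text: 0.15 M NaCl, 0.015 M sodium citrate.
SSC_1X_MOLAR = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_composition(strength: float) -> dict:
    """Molar concentration of each solute in SSC of the given strength (2 for 2x)."""
    return {solute: strength * molar for solute, molar in SSC_1X_MOLAR.items()}


def stock_volume_ml(stock_strength: float,
                    final_strength: float,
                    final_volume_ml: float) -> float:
    """Volume of concentrated SSC stock to dilute with water to the final volume."""
    if final_strength > stock_strength:
        raise ValueError("cannot dilute to a higher strength than the stock")
    return final_strength * final_volume_ml / stock_strength


for strength in (2, 7):
    print(strength, ssc_composition(strength))

# Assumed 20x stock (not stated in the text): volume needed for 500 ml of 7x SSC.
print(stock_volume_ml(20, 7, 500))
```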

Filters are air dried and baked in vacuo at 80°C for 2 
hours. Prehybridization of filters is done at 65°C for 30 
minutes in 10 mM Tris-HCl (pH 8.0), 1 mM EDTA, 0.1% SDS followed 
by 30 minutes in 7x SSC, 5x Denhardt's (1x Denhardt's is 0.02% 
polyvinylpyrrolidone, 0.02% ficoll, 0.02% bovine serum albumin), 
100 ug/ml denatured salmon sperm DNA, and 0.1% SDS. 32P-labelled 
human exon DNA, prepared as described above, is added to 10^6 
cpm/ml and the hybridization performed for 12-16 hours at 37°C. 
Filters are then washed in several changes of 7x SSC, 0.1% SDS for 
1-2 hours at 37°C. Filters are then air-dried and subjected to 
autoradiography with Kodak XAR film and a Dupont Lightning Plus 
intensifying screen. 

Those colonies showing a hybridization signal above 
background are grown in L-broth containing 50 ug/ml tetracycline 
for rapid prep purification of plasmid DNA. Plasmid DNA is 
purified by the method of Holmes et al., Anal. Biochem., 114:193 
(1981), the disclosure of which is incorporated herein by 
reference. An aliquot of this DNA is cleaved with restriction 
endonuclease Pst I and the fragments electrophoresed through 1% 
agarose/TBE gels and blotted according to the procedure of E. 
Southern, J. Mol. Biol. 98:503 (1975), Methods Enzymol. 69:152 
(1980), the disclosure of which is incorporated herein by 
reference. The nitrocellulose filters are hybridized with 
radiolabeled human AHF exon DNA as described for colony 
hybridization. Those plasmids which contain Pst I inserts 
hybridizing to the AHF exon DNA are used for DNA sequencing 
analysis. 

For sequencing, plasmid DNA (purified from 0.75 ml of 
culture by the procedure of Holmes, et al.) is digested 
to completion with the restriction endonuclease Sau 3AI. The 
resulting DNA segment, identified as 34-S1, has a DNA sequence 
which is depicted in Figure 4. The DNA is ethanol precipitated 
after extraction with phenol-chloroform and redissolved in 10 ul 
TE (10 mM Tris-HCl pH 8.0, 1 mM EDTA). 5 ul of the DNA solution 
is ligated with 20 ng of Bam HI cleaved M13 mp9 replicative form 
DNA in 100 ul of 50 mM Tris-HCl (pH 7.4), 10 mM MgCl2, 10 mM 
dithiothreitol, 1 mM ATP and an excess of T4 DNA ligase. 
Ligations are done at 15°C for 2-4 hours. 

5 ul of the ligation reactions are used to transform 200 
ul of E. coli strain JM101/TG1 as described above. Recombinants 
are identified as white plaques when grown on LB-agar plates 
containing X-gal as an indicator for beta-galactosidase activity 
as described by Davies, et al., J. Mol. Biol. 36:413 (1968). 

Recombinant phage harboring sequences hybridizing to the 
human exon are identified by the procedure of Benton, et al., 
Science 196:180 (1977), the disclosure of which is incorporated 
herein by reference, using radioactively labelled human AHF exon 
as a probe. Plaques showing a hybridization signal are picked 
and grown in 1.5 ml cultures of L-broth. Single-stranded phage 
DNA prepared from these cultures is used as template in 
oligonucleotide-primed DNA synthesis reactions. Sequencing is 
done using the dideoxy chain-termination procedure, see, e.g., 
Sanger et al., PNAS U.S.A. 74:5463 (1977). 

Human AHF recombinants are identified by comparing their 
nucleotide sequence with that which is known from the human exon 
sequence of human AHF. 
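
The comparison of a sequenced recombinant against the known exon sequence can be sketched as an ungapped search over both strands of the clone. The sequences below are hypothetical placeholders, not the AHF exon, and the scoring (fraction of identical positions) is a simplifying assumption that does not handle insertions or deletions.

```python
def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA strand written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


def best_match(clone: str, reference: str):
    """Slide the reference along both strands of the clone.

    Returns (identity, offset, strand) for the best ungapped alignment,
    where identity is the fraction of matching positions.
    """
    reference = reference.upper()
    best = (0.0, -1, "+")
    for strand, seq in (("+", clone.upper()), ("-", reverse_complement(clone))):
        for offset in range(len(seq) - len(reference) + 1):
            window = seq[offset:offset + len(reference)]
            matches = sum(x == y for x, y in zip(window, reference))
            score = matches / len(reference)
            if score > best[0]:
                best = (score, offset, strand)
    return best


exon_reference = "TTTCAGAAGAAAACACGA"                 # hypothetical exon stretch
clone_insert = "GGATCC" + exon_reference + "GAATTC"   # hypothetical clone read

print(best_match(clone_insert, exon_reference))
```

Searching both strands matters because an insert cloned into M13 may be read in either orientation.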

3. Porcine AHF mRNA 

Human AHF recombinants are used to screen a porcine 
tissue library constructed exactly as described for the human 
tissue cDNA library. Prospective porcine AHF recombinant DNA 
clones are identified by the Grunstein-Hogness procedure using 
the porcine 32P labeled exon fragment as a probe as described 
above. The probe is the porcine AHF exon segment labelled with 
32P by nick-translation as described by Rigby et al., J. Mol. 
Biol., 113:237 (1977), the disclosure of which is incorporated 
herein by reference. 

Colonies exhibiting hybridization signals are grown for 
rapid prep plasmid purification purposes as described above. 
Plasmid DNA is cleaved with the restriction endonuclease Pst I, 
electrophoresed through 1% agarose/TBE gels, and blotted 
according to the procedure of E. Southern (1975). The blots are 
hybridized with nick-translated porcine AHF recombinant DNA 
labelled with 32P. 

Full-length clones may be constructed in a conventional 
manner, such as by ligation of DNA fragments from overlapping 
clones at restriction enzyme sites common to both clones as is 
well known in the art ("gene walking"). 
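
At the sequence level, joining overlapping clones at a shared restriction site can be sketched as below. This is an illustrative model only: the clone sequences are hypothetical, the overlap is assumed to be an exact suffix-prefix match, and the join is made at the end of the recognition site rather than at the enzyme's actual cut position within it.

```python
def overlap_length(left: str, right: str, min_overlap: int = 6) -> int:
    """Length of the longest suffix of left that equals a prefix of right."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def join_at_shared_site(left: str, right: str, site: str) -> str:
    """Join two overlapping clones at a restriction site inside their overlap.

    Raises ValueError if the clones do not overlap or the overlap
    does not contain the site.
    """
    size = overlap_length(left, right)
    if size == 0:
        raise ValueError("clones do not overlap")
    pos = right[:size].find(site)
    if pos < 0:
        raise ValueError("no shared site in the overlap")
    # Keep the left clone through the site, then continue with the
    # right clone from just after the same site.
    left_cut = len(left) - size + pos + len(site)
    return left[:left_cut] + right[pos + len(site):]


PST_I_SITE = "CTGCAG"                       # Pst I recognition sequence
left_clone = "AAACCCGGGCTGCAGTTT"           # hypothetical 5' clone
right_clone = "GGGCTGCAGTTTACGTACGT"        # hypothetical 3' clone

print(join_at_shared_site(left_clone, right_clone, PST_I_SITE))
```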

4. Identifying Full-Length Clones from Steps 2 or 3 

The distance between the 5' end of an existing clone and 
the 5' end of the mRNA can be analyzed by the primer-extension 
technique described in Agarwal et al., "J.B.C." 256(2):1023-1028 
(1981) utilizing an oligonucleotide primer whose sequence comes 
from the 5' (amino-terminal) region of the existing AHF clone. 
If gels developed using this procedure show more than one 
transcript one should consider the most intense band as 
representing the full mRNA transcript. 

There are, however, many mRNAs which contain a long 
region of 5' untranslated sequence. Thus, expression of human 
AHF DNA may not be contingent upon the acquisition of a complete 
cDNA clone. For example, a clone may be obtained which by DNA 
sequencing analysis demonstrates the existence of a methionine 
codon followed by a sequence which is analogous to or identical 
to a known eucaryotic protein secretion signal and which is in 
frame with the remaining codons. Transformation and expression 
should be conducted for a clone which contains the met-secretion 
signal sequence, which is of the expected size and which contains 
a poly (T) 3' terminus. 
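
The check described above, a methionine codon that is in frame with the remaining codons, can be sketched as a simple reading-frame scan. The clone sequence is a hypothetical placeholder, and recognition of a secretion signal is not modelled; the sketch only locates the first ATG-initiated frame that ends at an in-frame stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(cdna: str):
    """Return (start, end) of the first ATG-initiated frame ending at an
    in-frame stop codon, or None if no such frame exists.

    start is the index of the A of ATG; end is one past the stop codon.
    """
    seq = cdna.upper()
    start = seq.find("ATG")
    while start != -1:
        for pos in range(start, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                return start, pos + 3
        start = seq.find("ATG", start + 1)
    return None


def in_frame(orf_start: int, position: int) -> bool:
    """True if a codon beginning at position is in frame with the initiator."""
    return (position - orf_start) % 3 == 0


# Hypothetical clone: 5' untranslated region, ATG, six codons, stop, 3' tail.
clone = "GGCACC" + "ATG" + "AAACTGCTG" + "GCTACC" + "TAA" + "AAAA"
print(first_orf(clone))
```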

5. Alternative Procedure 

As an alternative to the methods in Sections 1-3 
described above, it is preferred that the cDNA clones for porcine 
or human AHF be identified using a bacteriophage vector in the 
following manner. 

mRNA from human fetal liver tissue responsible for AHF 
synthesis was prepared by the guanidine hydrochloride method as 
described by Cox and modified by Chirgwin, et al. as set forth 
above in Example 5, Section 1. First strand cDNA was synthesized 
from 10 ug of polyA+ fetal liver RNA by the procedure described 
in Example 6, Section 1, supra. Specifically, 10 ug of this RNA 
was denatured at room temperature in 10 ul of 10 mM 
methylmercury hydroxide. 140 mM 2-mercaptoethanol was added to 
inactivate the methylmercury hydroxide. The RNA was then diluted 
to 50 ul containing 140 mM KCl, 100 mM Tris-HCl (pH 8.3 at 42°C), 
1 mM of each deoxynucleotide triphosphate, 200 ug/ml 
oligo(dT)12-18, 10 mM MgCl2, and 0.1 uCi 32P-dCTP/ml. These 
reactions were performed at 42°C for 1 hour after the addition of 
3 ul of 17 units/ul AMV reverse transcriptase (Life Sciences). 

For the first strand synthesis of a primer-extended 
library, 200 picomoles of a unique complementary 38mer was 
included in the CH3HgOH denaturation step, and after 10 minutes 
at 23°C, the reaction was made 140 mM beta-mercaptoethanol, 0.7 M 
KCl, 1 mM EDTA, 20 mM Tris-HCl (pH 8.3 at 42°C), 1 unit/ul of RNasin 
(Biotec) and incubated at 50°C for 2 minutes and at 42°C for 2 
minutes. This was then diluted to 50 ul containing 140 mM KCl, 
100 mM Tris-HCl (pH 8.3 at 42°C), 1 mM of each deoxynucleotide 
triphosphate, 10 mM MgCl2, and 0.1 uCi 32P-dCTP/ul. The reaction 
was incubated at 42°C for 1 hour after the addition of 3 ul of 
17 units/ul AMV reverse transcriptase. 

After first strand synthesis the reactions were diluted 
to 150 ul containing 10 mM MgCl2, 50 mM Tris pH 7.4, 5 mM 
2-mercaptoethanol, 7.5 mM NH4SO4, 250 uM of each deoxynucleotide 
triphosphate, and second strand synthesis initiated by the 
addition of 1 unit of RNase H (E. coli) and 45 units of DNA 
polymerase I. Reactions were incubated at 16°C for 8 hours and 
then terminated by the addition of EDTA to 20 mM and brought to a 
final volume of 200 ul with H2O. EcoRI methylation was then 
performed by addition of S-adenosyl methionine to 50 uM and 40 
units of EcoRI methylase. These reactions were incubated for 1 
hour at 37°C, terminated by phenol-chloroform extraction, and 
chromatographed using Sephadex G50 equilibrated in 10 mM Tris-HCl 
(pH 8.0), 1 mM EDTA, 0.1 M NaCl. The void volume was pooled and 
precipitated by ethanol addition. 

cDNA molecules were blunt-ended in 200 ul containing 50 mM 
Tris pH 8.3, 10 mM MgCl2, 10 mM 2-mercaptoethanol, 50 mM NaCl, 50m 
of each deoxynucleotide triphosphate, 100 ug/ml ovalbumin and 5 
units of T4 polymerase. The reaction was incubated at 37° for 30 
minutes and then terminated by phenol-chloroform extraction. 
Nucleic acids were then precipitated by ethanol addition. 

EcoRI "linkers" (Collaborative Research) were then ligated 
to the blunted cDNA molecules under standard conditions, Maniatis, 
et al., supra, at 243, in a total volume of 45 ul. The reactions 
were terminated by the addition of EDTA to 15 mM and extracted 
with phenol-chloroform. Nucleic acids were precipitated by 
ethanol and pelleted by centrifugation. The linkered cDNA was 
redissolved in 200 ul of 100 uM Tris-HCl (pH 7.2), 5 mM MgCl2, 50 mM 
NaCl and digested with 300 units of EcoRI for 2 hours at 37°C. 
The reaction was then extracted with phenol-chloroform and 
chromatographed over Sepharose CL-4B equilibrated with 10 mM Tris 
(pH 8.0), 1 mM EDTA, 0.1 M NaCl. cDNA in the void volume was 
collected, precipitated by ethanol and pelleted by 
centrifugation. 

cDNA was redissolved in 10 mM Tris-HCl (pH 8.0), 1 mM EDTA 
and ligated to EcoRI cleaved, phosphatased lambda Charon 21A DNA 
at various ratios of cDNA to vector DNA using standard ligation 
conditions. Ligated DNA was packaged and titered using 
established procedures, Maniatis, et al., supra, at 64, 256. The 
library was plated and screened using 32P-labelled human exon DNA 
under conditions described in Benton and Davis, supra. 
Overlapping clones which spanned approximately 10,000 base pairs 
were obtained and a substantial portion thereof was sequenced to 
reveal one long open reading frame encoding human AHF. The 
recombinant DNA nucleotide sequence obtained therefrom coding for 
human AHF is shown in Fig. 7 along with the deduced amino acid 
sequence for human AHF. The overlapping clones were assembled 
into vector pSP64 (Promega Biotec) using conventional techniques 
well known in the art, i.e. by ligation of DNA fragments from 
overlapping clones at restriction enzyme sites common to both 
clones. A pSP64 recombinant clone containing the nucleotide 
sequence depicted in Figure 7, designated as pSP64-VIII, is on 
deposit at the American Type Culture Collection under Accession 
Number ATCC . 

EXAMPLE 7 
Expression of human or porcine AHF 

This example contemplates expression of AHF using the 
full length clone obtained by the method of Example 6 in a 
cotransformation system. 

1. Preparation of Transformation Vector 

A direct method for obtaining the AHF transformation 
vector is described below. The pCVSVL plasmid is partially 
digested with Pst I, and the digest is separated by gel 
electrophoresis. After visualization, the linear DNA fragment band 
corresponding to full length plasmid is isolated as pCVSVL-Bl. 

The cDNA for AHF is rescued from the Example 5 cloning 
vectors by partial digest with Pst I. The partial digest is 
separated by gel electrophoresis, and the band corresponding to 
the molecular weight of the full length cDNA is isolated. 
pCVSVL-Bl is annealed with this DNA fragment, ligated with T4 
ligase at 15°C, transfected into E. coli strain HB101 and 
transformants selected for tetracycline resistance. The selected 
E. coli transformants are grown in the presence of tetracycline. 
Plasmid pCVSVL-Bla is recovered in conventional fashion. Proper 
orientation of the cDNA in the plasmid may be determined in 
conventional fashion by asymmetric digestion with an appropriate 
endonuclease(s). 

2. Cotransformation, Selection and Amplification 

Plasmid pCVSVL-Ala or pCVSVL-Bla and pAdD26SVpA#3 
(Kaufman et al., op. cit.) are mixed together (50 ug HP and 0.5 ug 
pAdD26SVpA#3) and precipitated by the addition of NaOAc pH 4.5 
to 0.3 M and 2.5 vols. of ethanol. Precipitated DNA is allowed to 
air dry, is resuspended in 2X HEBSS (0.5 ml) and mixed vigorously 
with 0.25 M CaCl2 (0.5 ml) as described (Kaufman et al., op. cit.). 
The calcium-phosphate-DNA precipitate is allowed to sit 30 minutes at 
room temperature and applied to CHO DUKX-B1 cells (Chasin and 
Urlaub, 1980, available from Columbia University). The growth 
and maintenance of these cells has been described (Kaufman et 
al., op. cit., Chasin and Urlaub 1980). 

The DUKX-B1 cells are subcultured at 5x10^5/10 cm dish 24 
hr. prior to transfection. After 30 minutes incubation at room 
temperature, 5 ml of a-media with 10% fetal calf serum is applied 
and the cells are incubated at 37° for 4.5 hr. The media is then 
removed from the monolayer, and 2 ml of a-media with 10% fetal calf 
serum, 10 ug/ml of thymidine, adenosine, and deoxyadenosine, and 
penicillin and streptomycin is applied. Two days later the cells are 
subcultured 1:15 into a-media with 10% dialyzed fetal calf serum, 
and penicillin and streptomycin, but lacking nucleosides. Cells 
are then fed again with the same media after 4-5 days. 

Colonies appear 10-12 days after subculturing into 
selective media. Methotrexate (MTX) selection and detection of 
AHF gene and selection gene amplification are conducted in 
accordance with Axel et al., U.S. Patent 4,399,216, or Kaufman et 
al., op. cit. 

AHF yields may be improved by cotransforming the 
permanent cell line EA.hy 926 (Edgell et al., "Proc. Natl. Acad. 
Sci." 80:3734-3737 (1983)) in place of DUKX-B1 cells. In this case 
the cotransforming selection gene should be the dominant DHFR 
gene disclosed by Haber et al., "Somatic Cell Genetics" 4:499-508 
(1982). Otherwise, the cotransformation and culture will be 
substantially as set forth elsewhere herein. 

3. Production of AHF 

AHF-producing CHO transformants selected in Section 2 are 
maintained in seed culture using standard techniques. The 
culture is scaled up to 10 liters by cell culture in conventional 
media. This medium need not contain MTX and will not contain 
nucleosides so as to exclude selection gene revertants. 

The culture supernatant is monitored for clotting 
activity following standard assays for use with blood plasma 
samples (reduction in clotting time of Factor VIII-deficient 
plasma). The supernatant may be purified by conventional 
techniques such as polyethylene glycol and glycine precipitations 
(as are known for use in purifying AHF from blood plasma) in 
order to increase the sensitivity of the Factor VIII assays. 

When Factor VIII activity has reached a peak the cells 
are separated from the culture medium by centrifugation. The 
Factor VIII is then recovered and purified by polyethylene glycol 
and glycine precipitations. Clotting activity is demonstrated in 
Factor VIII deficient plasma. 

4. Alternative Procedure for Production of AHF 

A full length cDNA containing the entire AHF coding 
region was constructed from the exon in HH-25 described above, 
two cDNA clones which overlapped the extreme 5' and 3' regions of 
that exon, and a third clone which overlapped the 3' cDNA clone 
and continued past the termination codon. This was constructed 
such that synthetic Sal I sites were placed just 5' to the 
initiator methionine and at a position 3' to the translational 
stop signal at the end of the AHF coding sequences. This allowed 
the placement into and excision from the polylinker Sal I site of 
pSP64. The Sal I fragment from this clone (pSP64-VIII) was 
purified and ligated to pCVSVL2. pCVSVL2 is a plasmid which is 
identical to the expression vector pCVSVL (Kaufman, et al., Mol. 
Cell Biol., 2:1304 (1982), the disclosure of which is incorporated 
herein by reference), except that it has deleted the Pst I site 
located 3' of the SV40 polyadenylation sequence, which is 
accomplished by conventional procedures, and that it contains a 
duplication of the SV40 Ava II "D" fragment upstream from the 
adenovirus major late promoter (MLP). pCVSVL2 was derived from 
pAdD26SVpA(1) (see Kaufman et al., supra) by adding XhoI linkers 
at each end of two SV40 Ava II "D" fragments, and inserting them 
into the XhoI site in pAdD26SVpA(1). The Ava II "D" fragments 
are both inserted such that the SV40 late promoter is in the same 
orientation as the Ad2 major late promoter. For plasmids pCVSVL 
and pAdD26SVpA(3), see Kaufman et al., supra. To insert the Sal 
I fragment of pSP64-VIII into pCVSVL, the unique Pst I site in 
pCVSVL2 was converted to a Sal I site by the procedure described 
in Rothstein et al., Methods Enzymol., 69:98 (1980), and the Sal I 
fragment excised from pSP64-VIII was inserted to yield 
pCVSVL2-VIII. pCVSVL2-VIII contains the correct 5' to 3' 
orientation which allows transcription of the AHF coding sequence 
from the adenovirus MLP of the vector. pCVSVL2-VIII is on 
deposit at the American Type Culture Collection under Accession 
Number . 

The pCVSVL2-VIII was introduced into monkey COS-7 cells 
(Mellon et al., Cell 27:279 (1981)) using the DEAE dextran 
transfection protocol described by Sompayrac and Danna, PNAS 
U.S.A. 78:7575 (1981). The cells were cultured in serum-free 
medium which was assayed for AHF activity 3 days after 
transfection. 

Factor VIII:C activity was determined by the chromogenic 
substrate assay, described by Didisheim, Science 129:389 (1959), 
which confirmed that human Factor VIII expression had been 
achieved at a level of about 0.05 units/ml. 

CLAIMS: 

1. A DNA sequence coding for human factor VIII:C 
substantially free of other human genes. 

2. A cloned human factor VIII:C gene. 

3. A transformed host containing a gene for human factor 
VIII:C, said host being selected from bacteria, yeasts, and 
mammalian cells. 

4. Isolated DNA coding for human factor VIII:C, 
comprising a polydeoxyribonucleotide having the sequence: 
5' CGC AGC TTT CAG AAG AAA ACA CGA CAC 
TAT TTT ATT GCT GCA GTC GAG AGG 3' 

5. The DNA of claim 4, excised from human genomic DNA. 

6. The DNA of claim 4, linked directly or indirectly to 
DNA from a non-human source. 

7. A cloning vector comprising the DNA segment of claim 

8. An expression vector for human factor VIII:C, 
comprising the DNA segment of claim 4. 

9. A transformed microorganism containing the expression 
vector of claim 7. 

10. A transformed cell line containing the expression 
vector of claim 7. 

11. A screening agent for identifying deoxyribonucleotide 
sequences and ribonucleotide sequences which encode for at least a 
portion of the human gene factor VIII:C, comprising a 
deoxyribonucleotide having at least a ten nucleotide portion of 
the sequence or its inverse complement: 

5' TTT CAG AAG AGA ACC CGA CAC TAT TTC 
ATT GCT GCG GTC GAG CAG CTC TGG GAT 
TAC GGC ATG AGC GAA TCC CCC CGG GCG 
CTA AGA AAC AGG 3' 

12. The DNA of claim 11, excised from porcine genomic DNA. 

13. The DNA of claim 11, linked to DNA from a non-porcine 
source. 

14. A cloning vector comprising the DNA segment of claim 
11. 


15. A DNA sequence coding for porcine factor VIII :C 
substantially free of other porcine genes. 

16. A cloned porcine factor VIII :C gene. 

17. A transformed host containing a gene for porcine 
factor VIII:C, said host being selected from bacteria, yeasts and 
mammalian cells. 

18. Isolated DNA coding for porcine factor VIII:C, 
comprising a polydeoxyribonucleotide having the sequence: 
5' TTT GAG AAG AGA ACC CGA CAC TAT TTC 
ATT GCT GCG GTG GAG CAG CTC TGG GAT 
TAC GGC ATG AGC GAA TCC CCC CGG GCG 
CTA AGA AAC AGG 3' 

19. An expression vector for porcine factor VIII:C, 
comprising the DNA segment of claim 18. 

20. A transformed microorganism containing the expression 
vector of claim 19. 

21. A method of isolating a gene fragment which encodes 
for at least a portion of a protein, comprising forming a genomic 
DNA library of an organism that produces the protein, forming at 
least one oligonucleotide probe whose sequence was selected solely 
based on the amino acid sequence contained in the protein, 
contacting the genomic library with the oligonucleotide under 
conditions favoring DNA/DNA hybridization, identifying the genomic 
DNA which hybridizes to the oligonucleotide, and isolating a 
segment of the genomic DNA which contains the 
oligonucleotide-hybridizing DNA. 

22. A method in accordance with claim 21, wherein at 
least one oligonucleotide probe comprises a plurality of 
oligonucleotides which contain differing DNA sequences, each of 
which corresponds in DNA sequence to the amino acid sequence. 

23. The method according to claim 22, wherein at least
one oligonucleotide probe contains a plurality of
oligonucleotides, one of which contains each possible DNA sequence
which corresponds with a selected amino acid sequence in the
protein.
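
The pool of oligonucleotides described in claim 23 can be illustrated by a minimal sketch that enumerates every DNA sequence encoding a short peptide under the standard genetic code. The peptide WDYGM is used only as an illustration; it is taken from the one-letter translation shown beneath the bracketed codons of FIG. 5. The function name `degenerate_pool` and the table layout are assumptions of this sketch and are not part of the claimed method.

```python
from itertools import product

# Standard genetic code: DNA codons grouped by one-letter amino acid.
CODONS = {
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "R": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "N": ["AAT", "AAC"],
    "D": ["GAT", "GAC"],
    "C": ["TGT", "TGC"],
    "Q": ["CAA", "CAG"],
    "E": ["GAA", "GAG"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "H": ["CAT", "CAC"],
    "I": ["ATT", "ATC", "ATA"],
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "K": ["AAA", "AAG"],
    "M": ["ATG"],
    "F": ["TTT", "TTC"],
    "P": ["CCT", "CCC", "CCA", "CCG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
    "T": ["ACT", "ACC", "ACA", "ACG"],
    "W": ["TGG"],
    "Y": ["TAT", "TAC"],
    "V": ["GTT", "GTC", "GTA", "GTG"],
}


def degenerate_pool(peptide):
    """Return every DNA sequence that encodes the given peptide.

    Hypothetical helper for illustration: one codon is chosen per residue,
    and all combinations of those choices are generated.
    """
    choices = [CODONS[aa] for aa in peptide.upper()]
    return ["".join(codons) for codons in product(*choices)]


# W and M have one codon each, D and Y two each, G four: 1*2*2*4*1 = 16.
pool = degenerate_pool("WDYGM")
print(len(pool))  # 16
print(pool[0])
```

Peptides rich in Trp and Met keep the pool small, which is why such regions are attractive when the probe must contain each possible coding sequence.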

24. The method of claim 22, wherein the oligonucleotide
probe has a length of from about 10 to 200 nucleotides. 

25. The method of claim 22, wherein the oligonucleotides 
have a length of from 10 to 20 nucleotides. 

26. The method of claim 22, wherein the oligonucleotides
have a chain length of from about 30 to 200 nucleotides.

27. The method of claim 22, wherein the oligonucleotides
have a chain length of about 40 to 50 nucleotides. 

28. A method in accordance with claim 22, wherein at 
least two oligonucleotide probes are formed corresponding in DNA 
sequence to amino acid sequences in the protein, and each 
oligonucleotide probe is contacted with the genomic DNA under 
hybridizing conditions. 

29. The method of claim 28, wherein one oligonucleotide
probe comprises a plurality of oligonucleotides having a length of
11 to 20 nucleotides, and another oligonucleotide probe comprises
at least one oligonucleotide having a length of 40 to 200
nucleotides.

30. The method of claim 28, wherein one nucleotide probe 
comprises at least one oligonucleotide having a length of 40 to 90 
nucleotides. 

31. A method of isolating a gene which encodes for human
factor VIII:C, comprising screening an mRNA library with a probe
which corresponds to at least a ten nucleotide sequence of the
following sequence or its inverse complement:

5' CGC AGC TTT CAA AAG AAA ACA CGA CAC
TAT TTT ATT GCT GGA GTG GAG AGG 3'

constructing a cDNA library from the mRNA which hybridizes with
the probe, and forming a gene which encodes for human factor
VIII:C by ligating AHF cDNA segments from the cDNA library.
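
The relationship between a probe, the recited sequence and its inverse complement can be sketched as follows. This is only an illustration under stated assumptions: `REFERENCE` holds the first 27 nucleotides of the claim 31 sequence as they are legible in this copy, "corresponds to" is interpreted here as sharing a contiguous run of at least ten nucleotides with either strand, and the helper names are hypothetical.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# First 27 nucleotides of the claim 31 sequence as printed (assumed legible).
REFERENCE = "CGCAGCTTTCAAAAGAAAACACGACAC"


def reverse_complement(seq):
    """Return the inverse complement of a DNA sequence, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def corresponds(probe, reference=REFERENCE, min_len=10):
    """True if the probe contains at least min_len contiguous nucleotides
    of the reference or of its inverse complement (an assumed reading of
    'corresponds to at least a ten nucleotide sequence')."""
    probe = probe.replace(" ", "").upper()
    for strand in (reference, reverse_complement(reference)):
        for i in range(len(strand) - min_len + 1):
            if strand[i:i + min_len] in probe:
                return True
    return False


print(reverse_complement("CGCAGC"))            # GCTGCG
print(corresponds("AGC TTT CAA AAG"))          # True
print(corresponds(reverse_complement(REFERENCE)[:12]))  # True
```

A probe drawn from either strand passes the check, which mirrors the claim language covering both the sequence and its inverse complement.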

32. A method for producing AHF, comprising:

(a) preparing one or more oligonucleotide probes which
hybridize with human DNA which encodes AHF, or which hybridize
with the complement of such DNA;

(b) using at least one such probe to identify cells 
containing mRNA transcripts having nucleotide sequences 
corresponding to AHF; 

(c) preparing cDNA from the step (b) mRNA;

(d) assembling the step (c) cDNA into a replicable cDNA
sequence encoding at least the amino acid sequence of mature human 
AHF; 

(e) cloning the step (d) sequence; 

(f) transforming a parental cell which does not express
AHF with the step (e) sequence; 

(g) culturing the step (f) transformed cell; and 

(h) recovering AHF from the culture.

33. The method of claim 32 wherein the cell is a
mammalian cell.

34. The method of claim 33 wherein the cell is
cotransformed with a selection gene.

35. Substantially pure human factor VIII:C, made by the
process of claim 32.

36. The DNA sequence of claim 1, wherein said sequence
comprises one or more DNA sequences selected from the DNA
nucleotide sequence depicted in Figure 7 or one or more DNA
sequences which hybridize to the DNA nucleotide sequence depicted
in Figure 7 and which on expression codes for a polypeptide which
exhibits factor VIII:C activity.

37. The DNA sequence of claim 1, wherein said sequence
comprises the DNA sequence shown in Figure 7.

38. The DNA sequence of claim 1, wherein said sequence
codes for a polypeptide having the amino acid sequence shown in Figure
7.

39. A recombinant DNA sequence which codes for a
polypeptide which produces factor VIII:C activity.

40. The DNA sequence of claim 39, wherein the polypeptide
comprises at least a major portion of the amino acid sequence of
Figure 7.

41. A polypeptide expressed by DNA which exhibits human
factor VIII:C activity comprising amino acid sequences selected
from the amino acid sequence depicted in Figure 7 or amino acid
sequences expressed by DNA sequences which hybridize to the DNA
nucleotide sequence depicted in Figure 7.

42. The polypeptide of claim 41, wherein said polypeptide
comprises amino acids corresponding to a segment of the amino acid
sequence of Figure 7 sufficient to provide factor VIII:C activity,
substantially free of human fibrinogen and fibronectin.

43. The polypeptide of claim 41, wherein said polypeptide 
is substantially free from other human polypeptides. 

44. The polypeptide of claim 43, wherein said other human
polypeptides are human factor VIII:vWF, fibrinogen and
fibronectin.

45. A vector comprising a DNA sequence coding for a
number of amino acid groups in the sequence depicted in Figure 7
sufficient to produce factor VIII:C activity.

46. A transformed host containing exogenous DNA coding
for a number of amino acid groups in the sequence depicted in
Figure 7 sufficient to produce factor VIII:C activity.

47. The transformed host of claim 46, wherein the host is 
selected from bacteria, yeast, insect or mammalian cells. 

48. A method of making human factor VIII:C, comprising 
transforming a host with a DNA sequence which codes for a 
polypeptide having a number of amino acid groups depicted in 
Figure 7 sufficient to produce factor VIII:C activity, and
expressing polypeptides from that sequence. 

49. The method of claim 48, wherein the polypeptide has 
the amino acid sequence depicted in Figure 7. 

50. The method of claim 48, wherein the DNA segment has 
the sequence depicted in Figure 7. 

51. The method of claim 48, further comprising recovering 
human factor VIII:C from said transformant. 

52. A glycosylated polypeptide consisting essentially of
a number of amino acids in the sequence shown in Figure 7
sufficient to provide factor VIII:C activity.

53. A pharmaceutical preparation useful for therapeutic
treatment of Hemophilia A comprising a sterile preparation of the 
polypeptide of claim 41. 

54. A method of treating Hemophilia A comprising
administering an effective dose of the polypeptide of claim 41.

FIG. 2

A. Amino Terminus of 166 Kd Polypeptide
XIRRYYLCAVELSWDYRQSELLRELHVDTRFPA

B. Fragment of 166 Kd Polypeptide
XVAKKHPKTWVHYISAEEEDWDYAPAVPSPSDRS/TYKSL

FIG. 4

AAA ACA CGG GGC ACC TGT TAA CCT GAA   27
CAA AGT AAA TAG Aww TGG AAG uAC TCC   54
CTC CAA GCT TCT GGG TCC CCC GAT GCC
CAA AGA GTG GGA ATC CCT AGA GAA GTC   108
ACC AAA AAG CAA AGC TCT CAG GAC GAA


apa 


CAT 

W/\X 


PAT 
WA X 


W*A W 


TTT 


ACC 
nw w 


CCT 

WW X 


GGA 


162 
CCG 

WW V 


















189 


1 WA 


ppa 

WuA 


a a P 


PA A 


TP & 


TTP 
x 1 W 


a AT 
AA 1 


APP 


APP 


















216 


a a x 
AAA 


& a a 
AAA 


TGA 


app 
Abu 


apa 
ALA 


APP 
AGG 


r*r*a 
wGA 


PAP 
GAw 


PP A 

wwA 


















243 




APA 




ppp 

WWW 


W 1 VI 


PAP 


PA a 


PPA 


PPG 


AGG 






AAG 


PPT 

Gw 1 


PTP 

GIG 


PPP 
wGw 


TPP 
1 WW 


m 

AAA 


















297 


gpp 


tpp 


uu x 


WW x 


GCG 

WW w 


ACG 


GCA 

uWA 


TCA 

X W A 


GAC. 


















324 


OCA 


CAT 




CCT 


1 WW 


TAG 


TTT 


t Ta 
1J A 


www 


GGA 


GGA 


AHA 


CAA 


AAT 


GGA 


CTA 

W X rt 


TGA 


l& 


















378 


TAT 


PTT 


W X-A 


APT 
AW 1 


«nA 


APP 


a a p 
AAO 


PPA 


GAA 


















405 


GAT 


___ 

TTT 




ATT 
AX X 


TAC 
X aw 


GGT 


PAG 

OAO 


PAT 
oAx 


GAA 


















432 


AAV 


pan 

WAw 


PAP 


CCT 
WW X 


CCC 

WOW 


AGC 


TTT 


PAP 
WAV? 


A AG 


















AGA ACC CGA CAC TAT TTC ATT GCT GCG   459
GTG GAG CAG CTC TGG GAT TAC GGG ATG   486
AGC GAA TCC CCC CGG GCG CTA AGA AAC   513
AGG TAT GGC TAC GTT GGC TAC TCC TCT   540
GTC CTA CCC TGG GGA CCT TTG TCT TGA   567
GCA GGT GCC GAA GCC ATG GGA AAG GCA   594
CAA GCA GTC TGG GGG TGG AGA GGC CAC   621
AGT GGG AGG ATG TGC TTG TTG GGG AGC   648
ACA GCG TGG TCG GGC AGG GAA GAG CAG   675
ACC GAC CTG AGG AGA G

J Indicates an ambiguity, in accordance with the Stanford code. 

FIG. 5

(250) 9 18 27

TGG AAG GCT GTG CGC TCC AAA GCC TCC GGT CCT

36 45 63

GCG ACG GCA TCA GAG GGA CAT AAG CCT TCC TAC

72 81 90 99

TTT TJA CCC GGA GGA AGA CAA AAT GGA CTA TGA

108 117 126

TGA TAT CTT CTA ACT GAA ACG AAG GGA GAA GAT

135 144 153 162

TTT GAC ATT TAC GGT GAG GAT GAA AAT CAG GAC

171 180 189 198

CCT CGC AGC TTT CAG AAG AGA ACC CGA CAC TAT
RSFQKRTRHY

207 216 225

TTC ATT GCT GCG GTG GAG CAG CTC [TGG GAT TAC
FIAAVEQLWDY

234 243 252 261

GGG ATG] AGC GAA TCC CCC CGG GCG CTA AGA
GMSESPRALR

267

AAC AGG TAT GGC TAC GTT GGC TAC TCC TCT
N R

GTC CTA CCC TGG GGA CCT TTC TCT TGA GCA
GGT GCC GAA GCC ATG GCA AAG GCA CAA GCA

GTC TGG GGG TGG AGA (3')
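
The pairing of codon rows with one-letter amino acid rows in FIG. 5 can be checked with a short translation sketch. It assumes the standard genetic code and one-letter amino acid symbols; the names `CODON_TABLE` and `translate` are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering;
# "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(codons):
    """Translate space-separated DNA codons into one-letter amino acids."""
    triplets = codons.upper().split()
    return "".join(CODON_TABLE[t] for t in triplets)


# Codons that are legible in FIG. 5, with the residues printed beneath them.
print(translate("GTG GAG CAG CTC TGG GAT TAC"))  # VEQLWDY
print(translate("AAC AGG"))                      # NR
```

Running the same function over a codon row and comparing the result with the letters printed beneath it is a quick way to spot a codon that was misread in reproduction.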

FIG. 6

HUMAN SEQ. DATA (Complement of 15 mer desired sequence)
5' GAC ATT TAT GAT GAG GAT GAA

ATT CAG AGC CCC CGC AGC TTT
CAG AAG AAA ACA CGA CAC TAT
TTT ATT GCT GCA GTG GAG AGG

5' GAATTCCCCACTGGCTAACTTCCTTAAATCCTCTGCAAACAAATTCCGACTTTTCATTAAATCACAAATT
TTACTTTTTTCCCCTCCTGCGAGCTAAACATATTrTA 

MET Gln Ile Glu Leu Ser Thr Cys Phe Phe Leu Cys Leu Leu Arg Phe Cys Phe 18

ATG C*l ATA GAG CTC TCC ACC TGC TTC TTT CTG TCC CTT TTC CCA TTC TCC TXT

Ser Ala Thr Arg Arg Tyr Tyr Leu Gly Ala Val Glu Leu Ser Trp Asp Tyr MET 36

IgI CCC ACC ACA 111 TAG TAC CTC CCT CCA CTC CAA CTC TCA TCC CAC TAT ATC

Gln Ser Asp Leu Gly Glu Leu Pro Val Asp Ala Arg Phe Pro Pro Arg Val Pro 54

CAA ACT GAT CTC CCT GAG CTC CCT CTC CAC CCA ACA TTT CCT CCT AGA CTC CCA

Lys Ser Phe Pro Phe Asn Thr Ser Val Val Tyr Lys Lys Thr Leu Phe Val Glu 72

IaH TCT TTT CCA TTC AAC ACC TCA CTC CTC TAC AAA AAC ACT CTG TTT CTA CAA



Phe Thr Val His Leu Phe Asn He Ala Lys 

TTC ACG GTT CAC CTT TTC AAC ATC CCT AAC 

Leu Cly Pro Thr He Cln Ala Glu Val Tyr 

£TA CGT CCT ACC ATC CAG CCT CAC GTT TAT 

Asn MET Ala Ser His Pro Val Ser Leu His 

.AAC ATG CCT TCC CAT CCT CTC ACT CTT CAT 

Ala Ser Clu Cly Ala Clu Tyr Asp Asp Gin 

CCT TCT GAG CCA CCT CAA TAT CAT CAT CAC 

Asp Lys Val Phe Pro Gly Gly Ser His Thr 

GAT AAA GTC TTC CCT CGT CCA ACC CAT ACA 

Asn Gly Pro MET Ala Ser Asp Pro Leu Cys 

AAT CGT CCA ATG CCC TCT CAC CCA CTC TCC 
*** 

Val Asp Leu Val Lys Asp Leu Asn Ser Cly 

CTG CAC CTC CTA AAA CAC TTC AAT TCA CGC 

Arg Glu Gly Ser Leu Ala Lys Clu Lys Thr 

ACA CAA CGC AGT CTC GCC AAG CAA AAC ACA 

Leu Phe Ala Val Phe Asp Clu Gly Lys Ser 

CTT TTT CCT- CTA TTT CAT CAA CCC AAA ACT 



Pro Arg Pro Pro Trp MET Gly Leu 
CCA ACG CCA CCC TCC ATC CCT CTC 



90 



Asp Thr 

CAT ACA 

Ala Vai 

CCT GTT 

Thr Ser 

ACC ACT 

lyr Val 

TAT CTC 



Val Vai 

CTG CTC 

Gly Val 

CCT GTA 

Gin Arg 

CAA ACC 

Trp CJn 

TCC CAC 



He Thr Leu Lys 108 
ATT ACA CTT AAC 



Ser Tyr Trp Lys 

TCC TAC TGG AAA 

Clu Lys Glu Asp 

GAG AAA GAA CAT 

Val Leu Lys Glu 

CTC CTC AAA GAG 



126 



144 



162 



Leu MET Cln Asp Arg Asp Ala 

TTC ATG CAG CAT ACC CAT CCT 

Val Asn Cly Tyr Val Asn Arg 

CTC AAT CCT TAT GTA AAC ACG 

Scr Vai Tyr Trp His Val He 

TCA CTC TAT TGG CAT CTG ATT 

Phe Lue Glu Gly His Thr Phe 

TTC CTC CAA CCT CAC ACA TTT 



Ala Ser Ala 

CCA TCT CCT 

Ser Leu Pro 

TCT CTG CCA 

Cly MET Cly 

CCA ATC GGC 

Leu Val Arg 

CTT CTC ACC 



Leu Thr Tyr Ser Tyr Leu Ser His 

CTT ACC TAC TCA TAT CTT TCT CAT 

He Gly Ala Leu Leu Val Cys 198 

ATT GCA CCC CTA CTA CTA TGT 

Thr Leu His Lys Phe He Leu 216 

ACC TTC CAC AAA TTT ATA CTA 

Asn Ser 234 

AAC TCC 

His Thr 252 

CAC ACA 

Arg Lys 270 

ACG AAA 

Ser lie 288 

TCA ATA 

Leu Glut *06 

TTC CAA 



Leu 
CTC 

Gin 
CAC 

Trp 
TCC 

CCG 

Cly 

CCT 

Thr 
ACC 

Asn 
AAC 



His Ser Clu Thr Lys 

CAC TCA GAA ACA AAC 

Ala Trp Pro Lys MET 

CCC TGG CCT AAA ATC 

Leu Lie Gly Cys Hia 

CTC ATT CCA TCC CAC 

Thr Pro Glu VaL His 

ACT CCT GAA GTG CAC 

Hi* Mrg Cl/r Ala Scr 

CAT CCC CAC CCC TCC 
Ile
ATC 


Ser 
TCG 


Pro 
CCA 


lie 
ATA 


Thr 
ACT 


Phe 
TTC 


Phe* 
TTT 


Leu 
CTA 


Leu 
CTC 


Phe 
TTT 


Cys 
TCT 


HAS 

CAT 


Val 
GTC 


Lys 
AAA 


Val 
GTA 


Asp 
GAC 


Ser 

AGC 


Cys 
TGT 


Glu 
GAA 


Ala- 
GCG 


Glu 
■GAA 


Asp 
GAC 


Tyr 
TAT 


Asp 
GAT 


Phe 
TTT 


Asp 
GAT 


Asp 
GAT 


Asp 
GAC 


Asn 
AAC 


Ser 
TCT 


His 
CAT 


Pro 
CCT 


Lys 
AAA 


Thr 
ACT 


Trp 
TGG 


Val 
GTA 


Ala 
GCT 


Pro 
CCC 


Leu 
TTA 


Val 
GTC 


Leu 
CTC 


Ala 
GCC 


Asn 
AAT 


Gly 
CCC 


Pro 
CCT 


Gin 
CAG 


Arg 

CGG 


He 
ATT 


Thr 
ACA 


Asp 
CAT 


Clu 
GAA 


Thr 
ACC 


Phe 
TTT 


Lys 
AAG 


Gly 
CGA 


Pro 
CCT 


Leu 
TTA 


Leu 
CTT 


Tyr 
TAT 


Gly 
CCC 


Gin 
CAA 


Ala 

GCA 


Ser 
ACC 


Arg 
AGA 


Pro 
CCA 


Tyr 
TAT 


Leu 
TTG 


Tyr 
TAT 


Ser 
TCA 


Arg 
AGG 


Arg 
ACA 


Leu 
TTA 


Leu 
CTG 


Pro 

CCA 


Gly 
GGA 


Glu 
GAA 


He 
ATA 


Phe 
TTC 


Thr 
ACT 


Lys 
AAA 


Ser 
TCA 


Asp 
GAT 


Pro 
CCT 


Arg 

CGG 


Glu 
GAG 


Arg 
AGA 


Asp 
GAT 


Leu 
CTA 


Ala 
GCT 


Ser 
TCA 


Ser 
TCT 


Val 
CTA 


Asp 
CAT 


Gin 
CAA 


Arg 
ACA 


Gly 
CGA 


Phe 

ttt 


Ser 
TCT 


Val 
CTA 


Phe 
TTT 


Asp 
GAT 


Clu 
CAC 


Phe 
TTT 


Lett 
CTC 


Pro 
CCC 


Asn 
AAT 


Pro 
CCA 


Ala 
CCT 


Asn 
AAC 


He 
ATC 


MET 
ATC 


His 
CAC 


Ser 
ACC 


He 
ATC 
Leu Thr Ala Gln Thr Leu Leu

CTT ACT CCT CAA ACA CTC TTC 

He Ser Ser His Gin His Asp 

ATC TCT TCC CAC CAA CAT GAT 

Pro Glu Glu Pro Gin I.eu Arg 

CCA GAG GAA CCC CAA CTA CCA 

Asp Asp Leu Thr Asp Ser Glu 

CAT CAT CTT ACT CAT TCT CAA 

Pro Ser Phe He Cln He Arg 

CCT TCC TTT ATC CAA ATT CCC 

His Tyr He Ala Ala Glu Glu 

CAT TAC ATT GCT GCT GAA CAC 

Pro Asp Asp Arg Ser Tyr Lys 

CCC CAT GAC AGA AGT TAT AAA 

Gly Arg Lys Tyr Lys Lys Val 

CCT AGG AAC TAC AAA AAA CTC 

Thr Arg Glu Ala He Cln His 

ACT CGT GAA CCT ATT CAG CAT 

Clu Val Cly Asp Thr Leu Leu 

CAA GTT CCA GAC ACA CTC TTC 

Asn He Tyr Pro His Gly He 

AAC ATC TAC CCT CAC CGA ATC 

Pro Lys Gly Val Lys His Leu 

CCA AAA GGT GTA AAA CAT TTC 

Lys Tyr Lys Trp Thr Val Thr 

AAA TAT AAA TGG ACA CTG ACT 

Cys Leu Thr - Arg Tyr Tyr Ser 

TGC CTG ACC CGC TAT TAC TCT 

Gly Leu He Gly Pro Leu Leu 

GCA CTC ATT CGC CCT CTC CTC 

Asn Gin He HLT S«r Asp Lys 

AAC CAG ATA ATC TCA CAC AAC 

Asn Arg Ser Trp Tyr Leu Thr 

AAC CGA AGC TGC TAC CTC ACA 

Gly Val Gin Leu Glu Asp Pro 

CGA GTC CAG CTT CAC CAT CCA 

Asn Cly Tyr Val Fhe A*p Ser 

AAT CCC TAT- CTT TTT CAT AGT 



2 



MET 
ATC 


Asp 
CAC 


Leu 
CTT 


Gly 

(;i:a 


Gin 
CAG 


324 


Gly 
CGC 


MET 
ATG 


Clu 
GAA 


Ala 
GCT 


Tyr 
TAT 


342 


MET 
ATC 


Lys 
AAA 


Asn 
AAT 


Asn 
AAT 


Glu 
CAA 


360 


MET 
ATG 


Asp 
GAT 


Val 
GTG 


Val 
GTC 


Arg 
AGG 


378 


Ser 
TCA 


Val 
GTT 


Ala 

GCC 


Lys 
AAG 


Lys 
AAG 


396 


Glu 
GAG 


Asp 
GAC 


Trp 
TCC 


Asp 
GAC 


Tyr 
TAT 


414 


Ser 
AGT 


Cln 
CAA 


Tyr 
TAT 


Leu 
TTG 


Asn 
AAC 


432 


Arg 
CCA 


Phe 
TTT 


MKT 
ATG 


Ala 
CCA 


Tyr 
TAC 


450 


Clu 
CAA 


Ser 
TCA 


Gly 

GGA 


He 
ATC 


Leu 
TTC 


468 


He 
ATT 


He 
ATA 


Phe 
TTT 


Lys 
AAG 


Asn 
AAT 


486 


Thr 
ACT 


Asp 
CAT 


Val 
CTC 


Arg 
CGT 


Pro 
CCT 


504 


Lys 
AAG 


Asp 
GAT 


Phe 
TTT 


Pro 
CCA 


lie 
ATT 


522 


Val 
GTA 


Glu 
GAA 


Asp 
GAT 


Gly 
GGG 


Pro 
CCA 


540 


AGT 


me 
TTC 


V <1 J. 

GTT 


AAT 


MET 
ATG 


558 


He 
ATC 


Cys 
TCC 


Tyr 
TAC 


Lys 
AAA 


Clu 
GAA 


576 


Arg 
AGG 


Asn 
AAT 


Val 
GTC 


He 
ATC 


Leu 
CTG 


594 


Giu 
CAG 


Asn 
AAT 


He 
ATA 


Cln 
CAA 


Arg 

CGC 


612 


Clu 
CAG 


Phe 
TTC 


Gin 
CAA 


Ala 
CCC 


Ser 
TCC 


630 


Leu 
TTG 


Gin 
CAG 


Leu 
TTC 


Scr 
TCA 


V*T 
GTT 


648 
TCT


Leu 
TTC 


His 
CAT 


Clu 
CAC 


Val 
CTC 


Ala 
CCA 


Tyr 
TAC 


Trp 
TCG 


Tyr 
TAC 


lie 
ATT 


Leu 
CTA 


S«r 
ACC 


lie 
ATT 


Cly 
GGA 


AJa 
GCA 


Gin 
CAG 


Thr 
ACT 


Asp 
CAC 


666 


Fhe 
TTC 


. Leu 
CTT 


Ser 
TCT 


Val 
CTC 


Phe 
TTC 


Phe 
TTC 


Ser 
TCT 


Gly 
CCA 


Tyr 
TAT 


Thr 
ACC 


Phe 
TTC 


Lys 
AAA 


His 
CAC 


Lys 
AAA 


MET 
ATG 


Val 
GTC 


Tyr 
TAT 


Glu 
GAA 


684 


Asp 
GAC 


Thr 
ACA 


Leu 
CTC 


Thr 
ACC 


Leu 
CTA 


Phe 
TTC 


Pro 
CCA 


Phe 
TTC 


Ser 
TCA 


Cly 
GGA 


Glu 
CAA 


Thr 
ACT 


Val 
GTC 


Ptic 
TTC 


MET 
ATG 


Ser 
TCG 


MET 
ATG 


Glu 
GAA 


702 


Asn 
AAC 


Pro 
CCA 


Cly 
CCT 


Leu 
CTA 


Trp 
TCC 


lie 
ATT 


Leu 
CTG 


Gly 
GGG 


Cys 
TCC 


His 
CAC 


Asn 
AAC 


Ser 
TCA 


Asp 
CAC 


Phe 
TTT 


Arg 
CCG 


Asn 
AAC 


Arg 
AGA 


Gly 
CCC 


720 


MET 
ATC 


Thr 
ACC 


Ala 
CCC 


Leu 
TTA 


Leu 
CTG 


Lys 
AAC 


Val 
GTT 


Ser 
TCT 


Ser 
AGT 


Cys 
TGT 


Asp 
CAC 


Lys 
AAG 


Asn 
AAC 


Thr 
ACT 


Cly 
GGT 


Asp 

GAT 


Tyr 
TAT 


Tyr 
TAC 


738 


Glu 
CAC 


App 
CAC 


Ser 
ACT 


Tyr 
TAT 


Clu 
CAA 


Asp 
CAT 


lie 
ATT 


Ser 
TCA 


Ala 
GCA 


Tyr 
TAC 


Leu 
TTC 


Leu 
CTG 


Ser 
ACT 


Lys 
AAA 


Asn 
AAC 


Asn 
AAT 


Ala 
CCC 


lie 
ATT 


756 


Glu 
GAA 


Pro 
CCA 


Arg 
ACA 


Ser 
ACC 


Phe 
TTC 


Ser 
TCC 


Gin 
CAC 


Asa 
AAT 


Ser 
TCA 


Arg 
ACA 


His 
CAC 


Pro 
CCT 


Ser 
AGC 


Thr 
ACT 


Arg 
AGC 


Gin 
CAA 


Lys 
AAC 


Gin 
CAA 


774 


Phe 
TTT 


Asn 
AAT 


Ala 
CCC 


Thr 
ACC 


Thr 

ACA 


lie 
ATT 


Pro 
CCA 


Clu 
CAA 


Asn 
AAT 


Asp 
CAC 


He 
ATA 


Clu 
CAC 


Lys 
AAG 


Thr 
ACT 


Asp 
CAC 


Pro 
CCT 


Trp 
TUG 


Phe 
TTT 


792 


Ala 
GCA 


His 
CAC 


Arg 
ACA 


Thr 
ACA 


Pre 
CCT 


MET 
ATC 


Pro 
CCT 


Lys 
AAA 


lie 
ATA 


Cln 
CAA 


Asn 
AAT 


Val 
CTC 


Scr 
TCC 


Ser 
TCT 


Ser 
ACT 


Asp 
CAT 


Leu 
TTG 


Leu 
TTG 


810 


MET 
ATG 


Leu 
CTC 


Leu 
TTG 


Arg 
CGA 


Cln 
CAG 


Ser 
ACT 


Pro 
CCT 


Thr 
ACT 


Pro 

CCA 


His 
CAT 


Gly 
CCG 


Leu 
CTA 


Ser 
TCC 


Leu 
TTA 


Ser 
TCT 


Asp 
GAT 


Leu 
CTC 


Gin 
CAA 


828 


Clu 
CAA 


AJa 
GCC 


Lys 
AAA 


Tyr 
TAT 


Clu 
CAC 


Thr 
ACT 


Phe 
TTT 


Ser 
TCT 


Asp 
GAT 


Asp 
GAT 


Pro 
CCA 


Ser 
TCA 


Pro 
CCT 


Gly 
GGA 


Ala 
GCA 


Tie 
ATA 


Asp 
CAC 


Ser 
AGT 


846 


Asn 
AAT 


Asn 
AAC 


Ser 
ACC 


Leu 
CTG 


Ser 
TCT 


Glu 
CAA 


MET 
ATC 


Thr 
ACA 


His 
CAC 


Phe 
TTC 


Arg 
ACC 


Pro 
CCA 


Cln 
CAG 


Leu 
CTC 


His 
CAT 


His 
CAC 


Scr 
ACT 


Cly 
CGG 


864 


Asp 
CAC 


MET 
ATG 


Val 
GTA 


Phe 
TTT 


Thr 
ACC 


Pro 
CCT 


Clu 
GAG 


Ser 
TCA 


Glv 
CGC 


Leu 
CTC 


Cln 
CAA 


Leu 
TTA 


Arg 
AGA 


Leu* 
TTA 


Asn 
AAT 


Glu 
CAG 


Lys 
AAA 


Leu 
CTG 


882 


Cly 
GGC 


Thr 
ACA 


Thr 
ACT 


Ala 
CCA 


Ala 
CCA 


Thr 
ACA 


Glu 
GAC 


Leu 
TTC 


Lys 
AAC 


Lys 
AAA 


Leu 
CTT 


ASp 

CAT 


l ne 
TTC 


uys 
AAA 


val 
CTT 


TCT 


oer 
ACT 


tut 

ACA 


900 


Ser 
TCA 


Asn 
AAT 


Asn 
AAT 


Leu 
CTC 


lie 
ATT 


Ser 
TCA 


Thr 
ACA 


lie 
ATT 


Tro 
CCA 


Ser 
TCA 


Asp 
CAC 


Asn 
AAT 


Leu 
TTG 


Ala 
CCA 


Ala 
CCA 


Giy 
CCT 


Thr 
ACT 


Asp 
GAT 


918 


Asn 
AAT 


Thr 
ACA 


Ser 
AGT 


Scr 
TCC 


Leu 
TTA 


Gly 
GGA 


Pro 
CCC 


Fro 
CCA 


Ser 
ACT 


MET 
ATC 


Pro 
CCA 


Val 
GTT 


His 
CAT 


Tyr 
TAT 


Asp 
GAT 


Ser 
AGT 


Cln 
CAA 


Leu 
TTA 


936 


Asp 
CAT 


Thr 
ACC 


Thr 
ACT 


Leu 
CTA 


Phe 
TTT 


Gly 
CGC 


Lys 
AAA 


Lys 
AAC 


Scr 
TCA 


Ser 
TCT 


Pro 
CCC 


Leu 
CTT 


Thr 
ACT 


Clu 
GAC 


Ser 
TCT 


Cly 
GGT 


Gly 

CCA 


Pro 
CCT 


9 54 


Leu 
CTG 


Ser 
AGC 


Leu 
TTG 


Ser 
ACT 


Clu 
CAA 


Clu 
GAA 


Asn 
AAT 


Asn 
AAT 


Asp 
CAT 


Ser 
TCA 


Lys 
AAG 


Leu 
TTG 


Leu 
TTA 


Clu 
CAA 


Ser 
TCA 


Gly 
GC'X 


Leu 
TTA 


MET 
ATC 


972 


Asn 
AAT 


Ser 
ACC 


Gin 
CAA 


Glu 
CAA 


Ser 
AGT 


Ser 
TCA 


Trp 
TGG 


Cly 
GCA 


Ly« 
AAA 


Asn 
AAT 


Val 
CTA 


Ser 
TCC 


Ser 
TCA 


Thr 
ACA 


GLu. 
GAG 


Ser 
ACT 


Gly 
CCT 


Ar& 
ACC 


990 
Leu
TTA 

Leu 
TTA 

Ala 
CCA 

Pro 
CCA 

Pro 
CCT 

Asa 
AAT 

Lys 
AAA 

Lys 
AAG 

Asm 
AAC 

Pro 
CCA 

Val 
GTA 

Ser 
ACC 

His 
CAC 

Gin - 
CAA 

MET 
ATC 

Cly 
CGG 

Arg 
ACA 

Glu 
CAA 

Arg 
ACG 



Phe 
TTT 

Phe 
TTC 

Tlir 
ACT 

Ser 
TCA 

Leu 
TTG 

His 
CAT 

Lys 
AAA 

KET 
ATG 

Ser 
TCT 

Glu 
CAA 

Cly 
CCA 

Ser 
AGC 

A sin 
AAT 

Glu 
GAG 

Lys 
AAC 

Ala 
GCA 

Thr 
ACA 

Gly 
GGC 

He 
ATA 



Lys 
AAA 

Lys 
AAA 

Asn 
AAT 

Vai 
GTC 

He 
ATT 

MET 
ATG 

Glu 
GAG 

Leu 
CTA 

Leu 
CTG 

Lys 
AAA 

Ly3 
AAG 

Arg 
AGA 

Cla 
CAA 

Asn 
AAT 

Asn 
AAC 

Tyr 
TAT 

Lys 
AAC 

Leu 
TTG 

Ser 
TCT 



Gly Lys Arg Ala His Gly 

CGG AAA AGA GCT CAT GCA 

Val Scr lie Ser Leu Leu 

GTT AGC ATC TCT TTC TTA 

Arg Lys Thr His He Asp 

AGA AAG ACT CAC. ATT GAT 

Trp Gin Asn Lie Leu Glu 

TCG CAA AAT ATA TTA GAA 

His Asp Arg MET Leu MET 

CAT GAC AGA ATG CTT ATG 

Ser Asn Lys Thr Thr Ser 

TCA AAT AAA ACT ACT TCA 

Gly Pro He Pro Pro Asp 

CGC CCC ATT CCA CCA CAT 

Phe Leu Pro Glu Ser Ala 

TTC TTG CCA GAA TCA GCA 

Asn Ser Cly Gin Cly Pro 

AAC TCT GCG CAA GGC CCC 

Ser Val Ciu Gly Gin Asn 

TCT CTG GAA GGT CAC AAT 

Gly Glu Phe Thr Lys Asp 

GGT GAA TTT ACA AAC CAC 

Asn Leu Phe Leu Thr Asn 

AAC CTA TTT CTT ACT AAC 

Glu Lys Lys He Gin Clu 

CAA AAA AAA ATT CAG GAA 

Val Val Leu Pro Gin He 

GTA GTT TTG CCT CAG ATA 

Leu Phe Leu Leu Ser Thr 

CTT TTC TAA CTG ACC ACT 

Ala Pro V,il Leu Gin .Asp 

GCT CCA CTA CTT CAA GAT 

Lys His Thr Ala His Phe 

AAA CAC ACA GCT CAT TTC 

Cly Asn Gin Thr Lys Cln 

CGA AAT CAA ACC AAC CAA 

Pro Asn Thr Ser Cln Cln 

CCT AAT ACA ACC CAG CAC 



Pro Ala Leu Leu Thr Lys Anp Asn Ala 1 ' 008 

CCT GCT TTC. TTG ACT AAA GAT AAT CCC 

Lys Thr Asn Lys Thr Scr Asn Asn Ser 1 ' 026 

AAG ACA AAC AAA ACT TCC AAT AAT TCA 



Gly Pro Ser Leu Leu He Glu Asn 

CGC CCA TCA TTA TTA ATT GAG AAT 

Ser Asp Thr Clu Phe Lys Lys Val 

ACT GAC ACT GAG TTT AAA AAA GTG 

Asp Lys Asn Ala Thr Ala Leu Arg 

GAC AAA AAT GCT ACA GCT TTC AGG 

Ser Lys Asn MET Glu MET Val Gin 

TCA AAA ACC ATG CAA ATC GTC CAA 

Ala Gin Asn Pro Asp MET Ser Phe 

CCA CAA AAT CCA GAT ATG TCG TTC 

Arg Trp He Gin Arg Thr His Cly 

AGC TCG ATA CAA ACG ACT CAT CCA 

Sor Pro Lys Gin Leu Val Ser Leu 

ACT CCA AAC CAA TTA GTA TCC TTA 

Phe Leu Ser Glu Lys Asn Lys Vai 

TTC TTG TCT CAC AAA AAC AAA tTC 

Val Gly Leu Lys Glu HET Val Phe 

CTA CGA CTC AAA GAG ATG CTT TTT 

Leu Asp Asn Leu His Clu Asn Asn 

TTC GAT AAT TTA CAT GAA AAT AAT 

Glu lie Glu Lys Lys - Glu Thr Leu 

CAA ATA GAA AAG AAG GAA ACA TTA 

His Thr Val Thr* Cly Thr Lys Asn 

CAT ACA CTG ACT GGC ACT AAG AAT 

Arg Cln Asn Val Clu Gly Ser Tyr 

ACC CAA AAT GTA CAA GCT TCA TAT 



Serl,044 
AGT 

Thr 1/062 
ACA 

Leu I' 080 
CTA 

Cinl'O* 8 
CAG 

Phe LI" 
TTT 

Lys 1,134 
AAG 

Gly 1,152 
GCA 

Val 1,170' 
CTA 

Pro 1,188 
CCA 

Thr 1,206 
ACA 

lie 1,224 
ATC 

Phe 1,242 
TTC 

Clu 1,260 
GAC 



Phe Arg Ser Leu Asn Asp Scr Thr Asn 1/278 

TTT AGG TCA TTA AAT CAT TCA ACA AAT 

Ser Lys Lys Gly Clu Glu Clu Asn Leu 1,296 

TCA AAA AAA CCC GAG CAA CAA AAC TTG 

He Val Glu Lys Tyr Ala Cys Thr Thr 1,314 

ATT GTA CAG AAA TAT CCA TCC ACC ACA 

Asn Phe VaL Thr Gin Ar* Ser Lys. Arg. 1/332 

AAT TTT CTC ACG CAA CCT ACT AAC ACTA 
Ala

J CCT 


Leu 
TTG 


Lys 
AAA 


(Un 
CAA 


Phe 
TTC 


Arg 
AGA 


Leu 

CTC 


Pro 
CCA 


Leu 
CTA 


Glu 
GAA 


Clu 
CAA 


Thr 
ACA 


Glu 
CAA 


Leu 
CTT 


Glu 
CAA 


Lys 
AAA 


Arg 
AGC 


Tie 
ATA 


1,350 


Ile* 
AXT 


Val 
CTG 


Asp 
GAT 


Asp 
GAC 


Thr 
ACC 


Ser 
TCA 


Thr 
ACC 


Cln 
CAC 


Trp 
TGC 


Ser 
TCC 


Lys 
AAA 


Asti 
AAC 


MKT 
ATC 


Lys 
AAA 


His 
CAT 


Leu 
TTC 


Thr 
ACC 


Pro- 
CCG 


1,368 


Ser 
AGC 


Thr 
ACC 


Leu 
CTC 


Thr 
ACA 


Gin 
CAC 


Tic 
ATA 


Asp 
CAC 


Tyr 
•TAC 


Asn 
AAT 


Glu 
CAC 


Lys 
AAG 


Glu 
CAC 


Lys 
AAA 


Gly 
GCG 


Ala 
CCC 


Tie 
ATT 


Thr 
ACT 


Cln 
CAC 


1,386 


Ser 
TCT 


Pro 
CCC 


Leu 

. TEA 


Ser 
TCA 


Asp 
CAT 


Cys 
TGC 


Leu 
CTT 


Thr 
ACG 


Arg 
AGG 


Ser 
ACT 


His 
CAT 


Ser 
AGC 


He 
ATC 


Pro 
CCT 


Cln 
CAA 


Ala 
GCA 


Asn 
AAT 


Arg 
AGA 


1,404 


Ser 
TCT 


Pro 
CCA 


Leu 
TTA 


Pro 
CCC 


lie 
ATT 


Ala 
GCA 


Lys 
AAG 


Val 
GTA 


Ser 
TCA 


Ser 
TCA 


P!ie 
TTT 


Pro 
CCA 


Ser 
TCT 


lie 
ATT 


Arg 
AGA 


Pro 
CCT 


He 
ATA 


Tyr 
TAT 


1,422 


Leu 
CTG 


Thr 
ACC 


Arg 
AGG 


Val 
GTC 


Leu 
CTA 


Phe 
TTC 


Gin 
CAA 


Asp 
GAC 


Asn 
AAC 


Ser 
TCT 


Ser 
TCT 


His 
CAT 


Leu 
CTT 


Pro 
CCA 


Ala 
CCA 


Ala 
GCA 


Scr 
TCT 


Tyr 
TAT 


1,440 


Arg 
ACA 


Lys 
AAC 


Lys 
AAA 


Asp 
GAT 


Ser 
TCT 


Cly 
CGG 


Val 
CTC 


Gin 
CAA 


Glu 
GAA 


Ser 
AGC 


Ser 
ACT 


His 
CAT 


Phe 
TTC 


Leu 
TTA 


Gin 
CAA 


Cly 
CCA 


Ala 
GCC 


Lys 
AAA 


1,456 


Cys 
AAA 


Asn 
AAT 


Asn 
AAC 


Leu 
CTT 


Ser 
TCT 


Leu 
TTA 


Ala 
GCC 


lie 
ATT 


Leu 
CTA 


Thr 
ACC 


Leu 
TTC 


Clu 
CAC 


MET 
ATC 


Thr 
ACT 


Cly 
CCT 


Asp 
CAT 


Cln 
CAA 


Arg 
ACA 


1,476 


Clu 
GAG 


Val 
GTT 


Cly 
OGC 


Ser 
TCC 


Leu 
CTG 


Gly 
GGC 


Thr 
ACA 


Ser 
AGT 


Ala 
GCC 


Thr 
ACA 


Asn 
AAT 


Scr 
TCA 


Val 
CTC 


Thr 
ACA 


Tyr 
TAC 


Lys 
AAC 


Lys 
AAA 


Val 
CTT 


1,494 


Clu 
GAG 


Asn 
AAC 


Thr 
ACT 


Val 
CTT 


Leu 
CTC 


Pro 
CCG 


Lys 
AAA 


Pro 
CCA 


Asp 
GAC 


Leu 
TTG 


Pro 
CCC 


Lys- 
AAA 


ITir 
ACA 


Ser 
TCT 


Gly 
CGC 


Lys 
AAA 


Val 
CTT 


Clu 
CAA 


1,512 


Leu 

• TTG 


Leu 
CTT 


Pro 
CCA 


Lys 
AAA 


Val 
GTT 


His 
CAC 


lie 
ATT 


Tyr 
TAT 


Glu 
CAC 


Lys 
AAG 


Asp 
CAC 


Leu 
CTA 


Phe 
TTC 


Pro 
CCT 


Thr 
ACC 


Clu 
GAA 


Thr 
ACT 


Ser 
AGC 


1,530 


Asn 
AAT 


Gly 
CGG 


Ser 
TCT 


Pro 
CCT 


Gly 
CGC 


His 
CAT 


Leu 
CTG 


Asp 

GAT 


Leu 
CTC 


Val 
CTG 


Glu 
CAA 


Gly 
GCG 


Ser 
ACC 


Leu 
CTT 


Leu 
CTT 


Gin 
CAC 


Cly 
CCA 


Thr 
ACA 


1,548 


Glu 
GAG 


«r 

Cly 
CCA 


Ala 
CCG 


lie 
ATT 


Lys 
AAC 


Trp 
TGC 


Asn 
AAT 


Glu 
GAA 


Ala 
CCA 


Asn 
AAC 


Arg 
AGA 


Pro 
CCT 


Cly 
CCA 


Lys 
AAA 


Val 
CTT 


Pro 
CCC 


Phe 
TTT 


Leu 
CTG 


1,566 


Arg 
ACA 


val 
GTA 


Ala 

CCA 


inr 
ACA 


iilU 
GAA 


oer 
ACC 


oer 
TCT 


GCA 


Lys 
AAC 


Tlii- 

1 41 1 

ACT 


Pro 
CCC 


Scr 
TCC 


uya 
AAC 


Leu 

CTA 


TTG 


Asp 
CAT 


Pro 
CCT 


Leu 
CTT 


1,584 


Ala 
GCT 


Trp 
TGC 


Asp 
GAT 


Asn 
AAC 


His 
CAC 


Tyr 
TAT 


Gly 
CCT 


Thr 
ACT 


Gin 
CAG 


He 
ATA 


Pro 
CCA 


Lys 
AAA 


Clu 

CAA 


Clu 
GAC 


Trp 
TGG 


Lys 
AAA 


Ser 
TCC 


Cln 

CAA 


1,602 


Clu 

GAG 


Lys 
AAC 


Ser 
TCA 


Pro 
CCA 


Clu 
CAA 


Lys 
AAA 


Thr 
ACA 


Ala 
GCT 


Phe 
TTT 


Lys 
AAC 


Lys 
AAA 


Lys 
AAG 


Asp 
CAT 


Thr 
ACC 


Tie 
ATT 


Leu 
TTC 


5«r 
TCC 


Leu 

CTG 


1,520 


Asn 
AAC 


Ala 
GCT 


Cys 
TCT 


Clu 
GAA 


Ser 
AGC 


Asn 
AAT 


His 
CAT 


Ala 
CCA 


Tie 
ATA 


Ala 
CCA 


Ala 
CCA 


lie 
ATA 


Asn 
AAT 


Clu 
CAC 


Gly 
CCA 


Cln 
CAA 


Asn 
AAT 


Lys 
AAC 


1,638 


Pro 
CCC 

ir 


Clu 
GAA 


lie 
ATA 


Glu 
GAA 


Val 
CTC 


Thr 
ACC 


Trp 
TCG 


Ala 
GCA 


Lys 
AAC 


Cln 
CAA 


Cly 
CCT 


Arg 

ACC 


Thr 
ACT 


Clu 
CAA 


Arg 

ACC 


Leu 
CTC 


Cys 
TCC 


Scr 
TCT 


1,656 


Cln 
CAA 


Asn 
AAC 


Pro 
CCA 


Pro 
CCA 


V,il 

CTC 


Leu 
TTC 


Lys 
AAA 


Arg 
CCC 


His 
CAT 


Cln 
CAA 


Arg 

CCC 


Gits 

CAA 


Jiff 

ATA 


7hr 
ACT 


Ar* 

CCT 


Thr 

ACT 


Hir 

ACT 


Leu 
CTT 
Gln
CAC 


Scr 
TCA 


Asp 
GAT 


Cln 
CAA 


Glu 
GAG 


Clu 
GAA 


lie 
ATT 


Asp 
CAC 


Tyr 
TAT 


Lys 
AAG 


Clu 
GAA 


Asp 
GAT 


The 
TTT 


Asp 
GAC 


He 
ATT 


Tyr 
TAT 


Asp 
CAT 


Glu 
GAG 


Cln 
CAA 


Lys 
AAG 


Lys 
AAA 


Thr 
ACA 


Arg 
CGA 


His 
CAC 


Tyr 
TAT 


Phe 
TTT 


He 
ATT 


Gly 
CCC 


MET 
ATC 


Ser 
ACT 


Ser 
ACC 


Ser 
TCC 


Pro 
CCA 


His 
CAT 


Val 
GTT 


Leu 
CTA 


Pro 
CCT 


Gin 
CAG 


Phe 
TTC 


Lys 
AAG 


Lys 
AAA 


Val 
GTT 


Val 
CTT 


Phe 
TTC 


Gin 
CAC 


Pro 
CCC 


Leu 
TTA 


Tyr 
TAC 


Arg 
CCT 


Gly 
GGA 


Glu 
GAA 


Leu 
CTA 


Asn 
AAT 


Glu 
GAA 


Arg 
ACA 


Ala 
CCA 


Glu 
GAA 


Val 
CTT 


Glu 
CAA 


Asp 
CAT 


Asn 
AAT 


lie 
ATC 


MET 
ATG 


Pro 
CCC 


Tyr 
TAT 


Ser 
TCC 


Phe 
TTC 


Tyr 
TAT 


Ser 
TCT 


Ser 
AGC 


Leu 
CTT 


lie 
ATT 


Ala 
CCA 


Clu 
GAA 


Pro 
CCT 


Arg 
ACA 


Lys 
AAA 


Asn 
AAC 


Phe 
TTT 


Val 
CTC 


Lys 
AAG 


Lys 
AAA 


Val 
CTG 


Gin 
CAA 


His 
CAT 


His 
CAT 


MET 
ATG 


Ala 

CCA 


Pro m 
CCC * 


Thr 
ACT 


Ala 
GCT 


Tyr 
TAT 


Phe 
TTC 


Ser 
TCT 


Asp 
GAT 


Val 
GTT 


Asp 
CAC 


Leu 
CTG 


Glu 
GAA 


Pro 
CCC 


Leu 
CTT 


Leu 
CTG 


Val 
CTC 


Cys 
TGC 


His 
CAC 


Thr 
ACT 


Asn 
AAC 


Thr 
ACA 


Thr 
ACA 


Val" 
CTA 


Gin 
CAG 


Glu 
GAA 


Phe 
TTT 


Ala 
GCT 


Leu 
CTG 


Phe 
TTT 


Phe 
TTC 


Thv 
TAC 


Phe 
TTC 


Thr 
ACT 


Glu 
CAA 


Asn 
AAT 


MET 
ATC 


Clu 
CAA 


Arg 
ACA 


Asn 
AAC 


Glu 
GAA 


Asp 
GAT 


Pro 
CCC 


Thr 
ACT 


Phe 
TTT 


Lys 
AAA 


Glu 
GAG 


Asn 
AAT 


Thr 
TAT 


MET 
ATG 


Asp 
CAT 


Thr 
ACA 


Leu 
CTA 


Pro 
CCT 


Gly 
CCC 


Leu 
TTA 


Val 
GTA 


MET 
ATC 


Leu 
CTC 


Leu 
CTC 


Ser 
ACC 


MET 
ATC 


Gly 
GGC 


Ser 
AGC 


Asn 
AAT 


Glu 
CAA 


Asn 
AAC 


Val 
CTC 


Phe 
TTC 


Thr 
ACT 


Val 
CTA 


Arg 

CCA 


Lyr> 
AAA 


Lys 
AAA 


Glu 
CAC 


Glu 
GAG 


Pea 

CCA 


CUy 
GGT 


Val 
CTT 


Phe 
TTT 


Glu. 
CAC 


Thx 
ACA 


Val 
GTG 


Glu. 
GAA 


HILT 
ATG 
Asp
CAT 


Asp 
CAT 


Thr 

ACC 


lie 

A T A 

ATA 


S«r 
TCA 


Val 

GTT 


Glu 
GAA 


MET 
ATG 


Lys 

A Af* 

AAG 


1,692 


Asp 
GAT 


Glu 
CAA 


Asn 
AAT 


Gin 
CAG 


Scr 
ACC 


Pro 
CCC 


Arg 
CGC 


Scr 
AGC 


Phe 
TTT 


1,710 


Ala 
GCT 


Ala 
GCA 


Val 
GTG 


Glu 
GAG 


Arg 
AGG 


Leu 
CTC 


Trp 
TGC 


Asp 
CAT 


Tyr 

TAT 
lAl 


1,728 


Arg 
ACA 


Asn 
AAC 


Arg 
AGG 


Ala 
CCT 


Gin 
CAC 


Ser 
AGT 


Gly 
CGC 


Ser 
Aul 


Val 


1,746 


Glu 
GAA 


Phe 
TTT 


Thr 
ACT 


Asp 
GAT 


Gly 
GGC 


Scr 
TCC 


Phe 
TTT 


Thr 

tor 
Awl 


Cln 


1,764 


His 
CAT 


Leu 
TTG 


Gly 
GGA 


Leu 
CTC 


Leu 
CTC 


Gly 
GGG 


Pro 

CCA 


Tyr 

TAT 


He 

ATA 
AAA 


1,782 


Val 
GTA 


Thr 
ACT 


Phe 
TTC 


Arg 
AGA 


Asn 

A A T 


Gin 

/■* A/* 


Ala 


Ser 
TCT 


Arg 
CCT 


1,800 


Ser 
TCT 


Tyr 
TAT 


Glu 
GAG 


Glu 
GAA 


Asp 
GAT 


Cln 
CAG 


Arg 

AU(* 


Gin 


Gly 
f*rt a. 

uyn 


1,318 


Pro 
CCT 


Asn 
AAT 


Glu 
GAA 


Thr 
ACC 


Lys 

AAA 


Thr 

ACT 


Tyr 

TAC 


Phe 
TTT 


Trp 

lull 


1,836 


Lys 
AAA 


Asp 
GAT 


Glu 
GAG 


Phe 
TTT 


Asp 
GAC 


cys 
TGC 


Lys 
AAA 


Ala 


Tr? 


1,854 


Lys 
AAA 


Asp 
CAT 


Val 
GTG 


His 
CAC 


Ser 
TCA 


Gly 
CCC 


Leu 
CTG 


He 

ATT 

ATT 


Gly 

Oun 


1,872 


Leu Asn Pro Ala His Gly Arg Gln Val   1,890
CTG AAC CCT GCT CAT GGG AGA CAA GT?


Thr Ile Phe Asp Glu Thr Lys Ser Trp   1,908
ACC ATC TTT GAT GAG ACC AAA AG? TGG


Cys Arg Ala Pro Cys Asn Ile Gln MET   1,926
TGC AGG GCT CCC TGC AAT ATC CAG ATG


Arg Phe His Ala Ile Asn Gly Tyr Ile   1,944
CGC TTC CAT GCA ATC AAT GGC TAC ATA


Ala Gln Asp Gln Arg Ile Arg Trp Tyr   1,962
GCT CAG GAT CAA AGG ATT CGA TGG TAT


Ile His Ser Ile His Phe Ser Gly His   1,980
ATC CAT TCT ATT CAT TTC AGT GGA CAT


Tyr Lys MET Ala Leu Tyr Asn Leu Tyr   1,998
TAT AAA ATG GCA CTC TAC AAT CTC TAT


Leu Pro Ser Lys Ala Gly Ile Trp Arg   2,016
TTA CCA TCC AAA GCT GGA ATT TGG CGG



Val Glu Cys Leu Ile Gly Glu His Leu His Ala Gly MET Ser Thr Leu Phe Leu   2,034
GTG GAA TGC CTT ATT GGC GAG CAT CTA CAT GCT GGG ATG AGC ACA CTT TTT CTG


Val Tyr Ser Asn Lys Cys Gln Thr Pro Leu Gly MET Ala Ser Gly His Ile Arg   2,052
GTG TAC AGC AAT AAG TGT CAG ACT CCC CTG GGA ATG GCT TCT GGA CAC ATT AGA


Asp Phe Gln Ile Thr Ala Ser Gly Gln Tyr Gly Gln Trp Ala Pro Lys Leu Ala   2,070
GAT TTT CAG ATT ACA GCT TCA GGA CAA TAT GGA CAG TGG GCC CCA AAG CTC GCC


Arg Leu His Tyr Ser Gly Ser Ile Asn Ala Trp Ser Thr Lys Glu Pro Phe Ser   2,088
AGA CTT CAT TAT TCC GGA TCA ATC AAT GCC TGG AGC ACC AAG GAG CCC TTT TCT


Trp Ile Lys Val Asp Leu Leu Ala Pro MET Ile Ile His Gly Ile Lys Thr Gln   2,106
TGG ATC AAG GTG GAT CTG TTG GCA CCA ATG ATT ATT CAC GGC ATC AAG ACC CAG


Gly Ala Arg Gln Lys Phe Ser Ser Leu Tyr Ile Ser Gln Phe Ile Ile MET Tyr   2,124
GGT GCC CGT CAG AAG TTC TCC AGC CTC TAC ATC TCT CAG TTT ATC ATC ATG TAT


Ser Leu Asp Gly Lys Lys Trp Gln Thr Tyr Arg Gly Asn Ser Thr Gly Thr Leu   2,142
AGT CTT GAT GGG AAG AAG TGG CAG ACT TAT CGA GGA AAT TCC ACT GGA ACC TTA


MET Val Phe Phe Gly Asn Val Asp Ser Ser Gly Ile Lys His Asn Ile Phe Asn   2,160
ATG GTC TTC TTT GGC AAT GTC GAT TCA TCT GGG ATA AAA CAC AAT ATT TTT AAC


Pro Pro Ile Ile Ala Arg Tyr Ile Arg Leu His Pro Thr His Tyr Ser Ile Arg   2,178
CCT CCA ATT ATT GCT CGA TAC ATC CGT TTG CAC CCA ACT CAT TAT AGC ATT CGC


Ser Thr Leu Arg MET Glu Leu MET Gly Cys Asp Leu Asn Ser Cys Ser MET Pro   2,196
AGC ACT CTT CGC ATG GAG TTG ATG GGC TGT GAT TTA AAT AGT TGC AGC ATG CCA


Leu Gly MET Glu Ser Lys Ala Ile Ser Asp Ala Gln Ile Thr Ala Ser Ser Tyr   2,214
TTG GGA ATG GAG AGT AAA GCA ATA TCA GAT GCA CAG ATT ACT GCT TCA TCC TAC


Phe Thr Asn MET Phe Ala Thr Trp Ser Pro Ser Lys Ala Arg Leu His Leu Gln   2,232
TTT AC? AAT ATG TTT GCC ACC TGG TCT CCT TCA AAA GCT CGA CTT CAC CTC CAA


Gly Arg Ser Asn Ala Trp Arg Pro Gln Val Asn Asn Pro Lys Glu Trp Leu Gln   2,250
GGG AGG AGT AAT GCC TGG AGA CCT CAG GTC AAT AAT CCA AAA GAG TGG CTC CAA


Val Asp Phe Gln Lys Thr MET Lys Val Thr Gly Val Thr Thr Gln Gly Val Lys   2,268
GTG GAC TTC CAG AAG ACA ATG AAA GTC ACA GGA GTA ACT ACT CAG GGA GTA AAA


Ser Leu Leu Thr Ser MET Tyr Val Lys Glu Phe Leu Ile Ser Ser Ser Gln Asp   2,286
TCT CTG CTT ACC AGC ATG TAT GTG AAG GAG TTC CTC ATC TCC AGC AGT CAA GAT


Gly His Gln Trp Thr Leu Phe Phe Gln Asn Gly Lys Val Lys Val Phe Gln Gly   2,304
GGC CAT CAG TGG ACT CTC TTT TTT CAG AAT GGC AAA GTA AAG GTT TTT CAG GGA


Asn Gln Asp Ser Phe Thr Pro Val Val Asn Ser Leu Asp Pro Pro Leu Leu Thr   2,322
AAT CAA GAC TCC TTC ACA CCT GTG GTG AAC TCT CTA GAC CCA CCG TTA CTC ACT


Arg Tyr Leu Arg Ile His Pro Gln Ser Trp Val His Gln Ile Ala Leu Arg MET   2,340
CGC TAC CTT CGA ATT CAC CCC CAG AGT TGG GTG CAC CAG ATT GCC CTG AGG ATG

Glu Val Leu Gly Cys Glu Ala Gln Asp Leu Tyr End   2,352
GAG GTT CTG GGC TGC GAG GCA CAG GAC CTC TAC TGA CGGTCCCCACTCCATCCCACCTCCCACTC

CCGTOUICTCTCCCTCCTCACCTC^ 
AACCCTCCTGAAITAACTATCATCACTCCTCCATTSC^ 

TTCTGCAGCTGCTCCCAGA 